Posted to user@cassandra.apache.org by Danny Chan <to...@gmail.com> on 2014/09/09 21:10:51 UTC

Quickly loading C* dataset into memory (row cache)

Hello all,

Is there a method to quickly load a large dataset into the row cache?
I use row caching as I want the entire dataset to be in memory.

I'm running a Cassandra 1.2 server with a dataset of 5,550,000
records (about 6GB) and a 6GB row cache. Key caching is disabled and
I am using SerializingCacheProvider. The machine running the Cassandra
server has 7GB of memory and 2 CPUs.

I have a YCSB client running on another machine that runs a read-only
benchmark against the Cassandra server. As the benchmark progresses, the
Cassandra server loads the dataset into the row cache.

However, it takes up to 2 hours to load the entire dataset into the row cache.

Is there any other method to load the entire dataset into the row cache
quickly (it does not have to use YCSB)?
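One thing I noticed in the cassandra.yaml shipped with 1.2 that might be relevant (I have not tried it myself): the row cache can apparently be saved to the saved-caches directory periodically and reloaded at startup, which would repopulate it without a benchmark run. Something like (sizes illustrative):

```yaml
# cassandra.yaml (1.2) -- row cache persistence; sizes illustrative
row_cache_size_in_mb: 6144
row_cache_provider: SerializingCacheProvider
# Save the row cache to saved_caches_directory every hour so it can be
# reloaded on startup instead of being warmed through client reads:
row_cache_save_period: 3600
# 0 means save all keys:
row_cache_keys_to_save: 0
```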


Any help is appreciated,

Danny

Re: Quickly loading C* dataset into memory (row cache)

Posted by Robert Coli <rc...@eventbrite.com>.
On Sat, Sep 13, 2014 at 11:48 PM, Paulo Ricardo Motta Gomes <paulo.motta@chaordicsystems.com> wrote:

> Apparently Apple is using Cassandra as a massive multi-DC cache, as per
> their announcement during the summit, but probably DSE with in-memory
> enabled option. Would love to hear about similar use cases.
>

There's caches and there's caches. I submit that, thus far, the usage of
the term "cache" in this conversation has not been specific enough to
enhance understanding.

I continue to assert, in a very limited scope, that 6GB of row cache in
Cassandra on a system with 7GB of RAM is Doing It Wrong.  :D
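To put rough numbers on it (heap size per the stock cassandra-env.sh formula; back-of-envelope only, actual usage varies):

```python
# Back-of-envelope memory budget for the setup described in this thread:
# 7GB of RAM with a 6GB off-heap row cache. The JVM heap follows the
# stock cassandra-env.sh default:
#   max(min(1/2 * ram, 1024 MB), min(1/4 * ram, 8192 MB))
RAM_MB = 7 * 1024

half = min(RAM_MB // 2, 1024)
quarter = min(RAM_MB // 4, 8 * 1024)
heap_mb = max(half, quarter)      # 1792 MB with 7GB of RAM

row_cache_mb = 6 * 1024           # off-heap (SerializingCacheProvider)
committed = heap_mb + row_cache_mb

print(f"heap={heap_mb} MB + row cache={row_cache_mb} MB "
      f"= {committed} MB committed of {RAM_MB} MB total")
# Heap plus cache alone exceed physical RAM, before counting the OS,
# page cache, or off-heap structures -- hence "Doing It Wrong".
print("overcommitted" if committed > RAM_MB else "fits")
```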

=Rob

Re: Quickly loading C* dataset into memory (row cache)

Posted by Paulo Ricardo Motta Gomes <pa...@chaordicsystems.com>.
Apparently Apple is using Cassandra as a massive multi-DC cache, as per
their announcement during the summit, but probably DSE with in-memory
enabled option. Would love to hear about similar use cases.

On Fri, Sep 12, 2014 at 12:20 PM, Ken Hancock <ke...@schange.com>
wrote:

> +1 for Redis.
>
> It's really nice: good primitives, and you can do some really cool
> stuff by chaining multiple atomic operations into larger atomic units
> through Lua scripting.
>
> On Thu, Sep 11, 2014 at 12:26 PM, Robert Coli <rc...@eventbrite.com>
> wrote:
>
>> On Thu, Sep 11, 2014 at 8:30 AM, Danny Chan <to...@gmail.com> wrote:
>>
>>> What are you referring to when you say memory store?
>>>
>>> RAM disk? memcached?
>>>
>>
>> In 2014, probably Redis?
>>
>> =Rob
>>
>>
>
>
>



-- 
Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200

Re: Quickly loading C* dataset into memory (row cache)

Posted by Ken Hancock <ke...@schange.com>.
+1 for Redis.

It's really nice: good primitives, and you can do some really cool
stuff by chaining multiple atomic operations into larger atomic units
through Lua scripting.
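Schematically, the win is that a whole multi-step script executes as one atomic unit, the way a Redis Lua script runs server-side without interleaving. A toy Python illustration of the concept (this is a sketch of the idea, not the Redis API):

```python
import threading

class TinyStore:
    """Toy in-memory store: each command is atomic, like a Redis primitive."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def eval_script(self, script, *args):
        # Analogue of Redis EVAL: the entire multi-step script runs
        # under one lock, so the chain of operations is a single
        # atomic unit from any other client's point of view.
        with self._lock:
            return script(self._data, *args)

# A "script" that moves a value between keys only if the source exists:
# two reads and two writes, with no possibility of interleaving.
def move_if_present(data, src, dst):
    if src in data:
        data[dst] = data.pop(src)
        return True
    return False

store = TinyStore()
store.set("a", 42)
moved = store.eval_script(move_if_present, "a", "b")
print(moved, store.get("b"), store.get("a"))  # True 42 None
```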

On Thu, Sep 11, 2014 at 12:26 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Sep 11, 2014 at 8:30 AM, Danny Chan <to...@gmail.com> wrote:
>
>> What are you referring to when you say memory store?
>>
>> RAM disk? memcached?
>>
>
> In 2014, probably Redis?
>
> =Rob
>
>



-- 
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com |
Skype: hancockks | Yahoo IM: hancockks |
LinkedIn: http://www.linkedin.com/in/kenhancock

This e-mail and any attachments may contain information which is
SeaChange International confidential. The information enclosed is
intended only for the addressees herein and may not be copied or
forwarded without permission from SeaChange International.

Re: Quickly loading C* dataset into memory (row cache)

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Sep 11, 2014 at 8:30 AM, Danny Chan <to...@gmail.com> wrote:

> What are you referring to when you say memory store?
>
> RAM disk? memcached?
>

In 2014, probably Redis?

=Rob

Re: Quickly loading C* dataset into memory (row cache)

Posted by Danny Chan <to...@gmail.com>.
What are you referring to when you say memory store?

RAM disk? memcached?

Thanks,

Danny

On Wed, Sep 10, 2014 at 1:11 AM, DuyHai Doan <do...@gmail.com> wrote:
> Rob Coli strikes again, you're Doing It Wrong, and he's right :D
>
> Using Cassandra as a distributed cache is a bad idea, seriously. Putting
> 6GB into the row cache is another one.
>
>
> On Tue, Sep 9, 2014 at 9:21 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>
>> On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan <to...@gmail.com> wrote:
>>>
>>> Is there a method to quickly load a large dataset into the row cache?
>>> I use row caching as I want the entire dataset to be in memory.
>>
>>
>> You're doing it wrong. Use a memory store.
>>
>> =Rob
>>
>
>

Re: Quickly loading C* dataset into memory (row cache)

Posted by DuyHai Doan <do...@gmail.com>.
Rob Coli strikes again, you're Doing It Wrong, and he's right :D

Using Cassandra as a distributed cache is a bad idea, seriously. Putting
6GB into the row cache is another one.


On Tue, Sep 9, 2014 at 9:21 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan <to...@gmail.com> wrote:
>
>> Is there a method to quickly load a large dataset into the row cache?
>> I use row caching as I want the entire dataset to be in memory.
>>
>
> You're doing it wrong. Use a memory store.
>
> =Rob
>
>

Re: Quickly loading C* dataset into memory (row cache)

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan <to...@gmail.com> wrote:

> Is there a method to quickly load a large dataset into the row cache?
> I use row caching as I want the entire dataset to be in memory.
>

You're doing it wrong. Use a memory store.

=Rob