Posted to user@cassandra.apache.org by SHANKAR REDDY <sa...@gmail.com> on 2015/06/09 10:27:02 UTC

Cassandra Insert Rate

Team,
I have a sample insert program that loads around 10 million records, and I
found that the insert rate is only around 1,500 per second. This is very slow.

The source code I am using is available at the location below. I am using the
latest version, 2.1.6, with default settings on a single-node VM with 20 GB of
RAM and a 100 GB SSD.

https://github.com/shankar-reddy/CassandraSandbox/blob/master/src/main/java/com/itreddys/cassandra/example/BulkLoadTest.java

Please suggest how I can improve the insert rate.

-Shankar

Re: Cassandra Insert Rate

Posted by SHANKAR REDDY <sa...@gmail.com>.
Thanks for the quick response on this, Marcus. I appreciate it.

That helps for batch loads.

If 10 million users are each inserting a new record at the same time (one
record per user), how do we increase the insert rate in that case? My sample
program assumes that kind of workload: 10 million separate requests, one
record each.

-Shankar


On Tue, Jun 9, 2015 at 1:56 AM, Marcus Olsson <ma...@ericsson.com>
wrote:

>  Hi Shankar,
>
> I would say:
> * Use prepared statements, so that only the bound values are sent with
> each query instead of the whole statement.
> * Use session.executeAsync() to improve concurrency.
>
> So you would start by creating a prepared statement, something like:
>
> PreparedStatement ps = session.prepare("INSERT INTO ks.tb (key1,data1,data2) VALUES (?,?,?)"); // Only done once
>
> And then in loadData():
> session.executeAsync(ps.bind("key", "1", "2"));
>
> executeAsync() does not wait for the query's response, so the response
> should probably be handled elsewhere (see the link below for how you can
> get the results back).
>
> http://www.datastax.com/dev/blog/java-driver-async-queries
>
> BR
> Marcus Olsson

Re: Cassandra Insert Rate

Posted by Marcus Olsson <ma...@ericsson.com>.
Hi Shankar,

I would say:
* Use prepared statements, so that only the bound values are sent with
each query instead of the whole statement.
* Use session.executeAsync() to improve concurrency.

So you would start by creating a prepared statement, something like:

PreparedStatement ps = session.prepare("INSERT INTO ks.tb (key1,data1,data2) VALUES (?,?,?)"); // Only done once

And then in loadData():
session.executeAsync(ps.bind("key", "1", "2"));

executeAsync() does not wait for the query's response, so the response
should probably be handled elsewhere (see the link below for how you can
get the results back).

http://www.datastax.com/dev/blog/java-driver-async-queries
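
One thing to watch with the approach above: firing 10 million executeAsync()
calls in a tight loop leaves an unbounded number of requests pending. A common
way to handle that is to cap the number of in-flight requests with a semaphore
and count the results in a completion callback. The sketch below shows only
that throttling pattern and does not use the Cassandra driver. The class
ThrottledAsyncInserts, the method insertAll, and the asyncWrite parameter are
hypothetical names for illustration, and asyncWrite returns a
CompletableFuture as a stand-in. In the 2.1 Java driver, executeAsync()
returns a ResultSetFuture (a Guava ListenableFuture), so the callback wiring
would need to be adapted accordingly.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

public class ThrottledAsyncInserts {

    /** Success and failure counts for one load run. */
    static final class Result {
        final int succeeded;
        final int failed;

        Result(int succeeded, int failed) {
            this.succeeded = succeeded;
            this.failed = failed;
        }
    }

    /**
     * Issues `total` asynchronous writes while keeping at most `maxInFlight`
     * of them pending at any time. `asyncWrite` is a hypothetical stand-in
     * for a call such as session.executeAsync(ps.bind(...)).
     */
    static Result insertAll(int total, int maxInFlight,
                            IntFunction<CompletableFuture<Void>> asyncWrite) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger succeeded = new AtomicInteger();
        AtomicInteger failed = new AtomicInteger();

        for (int i = 0; i < total; i++) {
            // Blocks the producer once maxInFlight requests are pending.
            permits.acquireUninterruptibly();
            CompletableFuture<Void> pending;
            try {
                pending = asyncWrite.apply(i);
            } catch (RuntimeException e) {
                // The write could not even be submitted: count it and move on.
                failed.incrementAndGet();
                permits.release();
                continue;
            }
            pending.whenComplete((ignored, error) -> {
                if (error == null) {
                    succeeded.incrementAndGet();
                } else {
                    failed.incrementAndGet();
                }
                permits.release(); // frees a slot for the next request
            });
        }

        // Holding every permit means all callbacks have finished.
        permits.acquireUninterruptibly(maxInFlight);
        permits.release(maxInFlight);
        return new Result(succeeded.get(), failed.get());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Simulated writes: every 1000th one fails.
            Result r = insertAll(10_000, 128, i -> CompletableFuture.runAsync(() -> {
                if (i % 1000 == 999) {
                    throw new IllegalStateException("simulated write failure");
                }
            }, pool));
            System.out.println("succeeded=" + r.succeeded + " failed=" + r.failed);
        } finally {
            pool.shutdown();
        }
    }
}
```

The value 128 for maxInFlight is an arbitrary starting point, not a
recommendation. The right limit depends on the cluster and the driver's
connection settings, so it is worth measuring a few values.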

BR
Marcus Olsson
