Posted to user@cassandra.apache.org by ChingShen <ch...@gmail.com> on 2010/09/02 08:09:54 UTC

about insert benchmark

Hi all,

  I ran a benchmark with my own code and found that the run of 100000 inserts
performs better than the other runs. Why?
 Can anyone explain it?

Thanks.

Partitioner = OPP
CL = ONE
==============================================
1000 records
insert one:201 ms
insert per:0.201 ms
insert thput:4975.1245 ops/sec
==============================================
10000 records
insert one:1950 ms
insert per:0.195 ms
insert thput:5128.205 ops/sec
==============================================
100000 records
insert one:15576 ms
insert per:0.15576 ms
insert thput:6420.134 ops/sec
==============================================
500000 records
insert one:82177 ms
insert per:0.164354 ms
insert thput:6084.4272 ops/sec

Shen

Re: about insert benchmark

Posted by Terje Marthinussen <tm...@gmail.com>.
Runs of 1000 and 10000 records are too short to really benchmark anything. You
will spend 2 seconds just on things like TCP window sizes adjusting to the
level where you get full throughput.

The difference between 100k and 500k is less than 10%. Could be anything.
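
To put a number on that, here is a small sketch that recomputes the per-insert time and throughput from the totals in the original post and compares the two larger runs. The function names are made up for illustration; only the record counts and millisecond totals come from the thread.

```python
# Totals copied from the original post: record count -> total elapsed ms.
runs = {1000: 201, 10000: 1950, 100000: 15576, 500000: 82177}


def throughput(records, total_ms):
    """Inserts per second for a run that took total_ms milliseconds."""
    return records / (total_ms / 1000.0)


def relative_diff(a, b):
    """Absolute difference between a and b, as a fraction of a."""
    return abs(a - b) / a


rates = {n: throughput(n, ms) for n, ms in runs.items()}
for n, rate in rates.items():
    print(f"{n:>7} records: {runs[n] / n:.5f} ms/insert, {rate:.1f} ops/sec")

gap = relative_diff(rates[100000], rates[500000])
print(f"100k vs 500k: {gap:.1%} apart")
```

The gap between the 100k and 500k runs works out to roughly 5%, which is small enough that a single run per size cannot separate it from noise.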

Filesystem caches, sizes of memtables (the default memtable settings flush a
memtable when it reaches 300k entries)... difficult to say.

You should benchmark something larger than that. You need to at least trigger
some SSTable compactions and proper Java GC work if you really want to know
what your performance is.
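
A minimal sketch of that idea: run an untimed warm-up phase first, then time a much larger batch. The `fake_insert` function is a stand-in that only writes to a local dict; it is an assumption for illustration and would be replaced by a real client call. The counts are arbitrary placeholders, not recommended values.

```python
import time


def run_benchmark(insert, total, warmup):
    """Time `total` calls to insert(i) after `warmup` untimed calls."""
    for i in range(warmup):
        insert(i)  # warm-up phase: executed but not measured
    start = time.perf_counter()
    for i in range(warmup, warmup + total):
        insert(i)
    elapsed = time.perf_counter() - start
    ops = total / elapsed if elapsed > 0 else float("inf")
    return {"records": total, "seconds": elapsed, "ops_per_sec": ops}


# Hypothetical stand-in for a real insert; it only fills a local dict.
store = {}


def fake_insert(i):
    store[f"key{i:08d}"] = i


result = run_benchmark(fake_insert, total=200_000, warmup=20_000)
print(f"{result['records']} records in {result['seconds']:.3f} s, "
      f"{result['ops_per_sec']:.0f} ops/sec")
```

Against a real cluster the timed phase would need to be long enough to include memtable flushes, compactions and garbage collection, so the record count should be far larger than in the original runs.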

Terje

Re: about insert benchmark

Posted by Thorvaldsson Justus <ju...@svenskaspel.se>.
Are you inserting with batch_mutate? The difference could come from the batch (packet) size, if not from the number of threads sending data to the Cassandra nodes.

Re: about insert benchmark

Posted by ChingShen <ch...@gmail.com>.
Sorry, my Cassandra version is 0.6.4.

Re: about insert benchmark

Posted by Aaron Morton <aa...@thelastpickle.com>.
Are you running all of the inserts through one node or distributing the connections around the cluster?

You are using the order-preserving partitioner, so the load around the cluster will be highly dependent on the keys you send. Are they evenly distributed?
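
A rough sketch of why the key distribution matters. It compares placing keys by sort order with placing them by hash, on four nodes. The node names, range boundaries and key format are assumptions for illustration, replication is ignored, and the modulo hashing is a simplification rather than how Cassandra assigns tokens.

```python
import bisect
import hashlib
from collections import Counter

NODES = ["A", "B", "C", "D"]
# Hypothetical range boundaries: A owns keys sorting below "g",
# B owns "g" up to "n", C owns "n" up to "t", D owns the rest.
BOUNDARIES = ["g", "n", "t"]


def ordered_owner(key):
    """Order-preserving placement: the node depends on where the key sorts."""
    return NODES[bisect.bisect_right(BOUNDARIES, key)]


def hashed_owner(key):
    """Hashed placement (simplified): the node depends on a hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


# Sequential keys, as a simple benchmark loop might generate them.
keys = [f"key{i:06d}" for i in range(10000)]
ordered = Counter(ordered_owner(k) for k in keys)
hashed = Counter(hashed_owner(k) for k in keys)
print("order-preserving:", dict(ordered))
print("hashed:          ", dict(sorted(hashed.items())))
```

With these sequential keys every insert falls in node B's range under order-preserving placement, while the hashed placement spreads them across all four nodes.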

The JVM will also optimize hot spots the longer the process runs, so longer runs can show better per-insert times.

The throughput difference between 100000 and 500000 is only about 5%.

All seems fine.
Aaron

Re: about insert benchmark

Posted by ChingShen <ch...@gmail.com>.
Hi Daniel,

   I have 4 nodes in my cluster, and I ran the benchmark on node A, in Java.
  P.S. Replication factor = 3

Shen

Re: about insert benchmark

Posted by vineet daniel <vi...@gmail.com>.
Hi Ching

Are you inserting using PHP, Perl, Python, Java, or something else? Is Cassandra
installed locally or on a networked system, and is it a single system or a
cluster of nodes? I know I've asked you many questions, but the answers will
help immensely in assessing the results.

Anyway, congrats on getting better results :-)
Regards
Vineet Daniel

