Posted to user@cassandra.apache.org by Jeff Williams <je...@wherethebitsroam.com> on 2012/05/03 20:29:41 UTC

Re: Write performance compared to Postgresql

Just to follow this up: I repeated the test with a multi-threaded Java (Hector) client and got much better performance, 10,000 rows in just over a second. So it looks like client latency was the killer, and I have since read that the Ruby Thrift implementation is not the fastest.
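
For anyone repeating this, here is a minimal sketch of the idea, not the actual test code. It assumes each insert costs roughly one round trip and imitates that with a 1 ms sleep; writeRow, the class name, the row count and the pool size of 20 are all made up for illustration, and no real Cassandra or Hector calls are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentWriteSketch {
    // Counts completed writes so both runs can be checked for completeness.
    static final AtomicInteger completed = new AtomicInteger();

    // Hypothetical stand-in for one client insert: sleeps ~1 ms to imitate a
    // network round trip. A real test would call the client library here.
    static void writeRow(int rowId) throws InterruptedException {
        Thread.sleep(1);
        completed.incrementAndGet();
    }

    // Issues all writes one after another; returns elapsed milliseconds.
    static long runSerial(int rows) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < rows; i++) {
            writeRow(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Spreads the same writes over a fixed pool of worker threads;
    // returns elapsed milliseconds.
    static long runConcurrent(int rows, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < rows; i++) {
                final int rowId = i;
                pending.add(pool.submit(() -> { writeRow(rowId); return null; }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        int rows = 1000;
        System.out.println("serial ms:     " + runSerial(rows));
        System.out.println("concurrent ms: " + runConcurrent(rows, 20));
    }
}
```

With a fixed per-write latency, the serial loop pays that latency once per row, while the pooled version overlaps the waits, which is the effect the thread below describes. Real numbers will depend on the client library, the network and the cluster.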

On Apr 4, 2012, at 9:11 AM, Jeff Williams wrote:

> On three machines on the same subnet as the two cassandra nodes.
> 
> On Apr 3, 2012, at 6:40 PM, Collard, David L (Dave) wrote:
> 
>> Where is your client running?
>> 
>> -----Original Message-----
>> From: Jeff Williams [mailto:jeffw@wherethebitsroam.com] 
>> Sent: Tuesday, April 03, 2012 11:09 AM
>> To: user@cassandra.apache.org
>> Subject: Re: Write performance compared to Postgresql
>> 
>> Vitalii,
>> 
>> Yep, that sounds like a good idea. Do you have any more information about how you're doing that? Which client?
>> 
>> Because even with 3 concurrent client nodes, my single PostgreSQL server is still outperforming my 2-node Cassandra cluster, although the gap is narrowing.
>> 
>> Jeff
>> 
>> On Apr 3, 2012, at 4:08 PM, Vitalii Tymchyshyn wrote:
>> 
>>> Note that having tons of TCP connections is not good. We are using an async client to issue multiple calls over a single connection at the same time. You can do the same.
>>> 
>>> Best regards, Vitalii Tymchyshyn.
>>> 
>>> 03.04.12 16:18, Jeff Williams wrote:
>>>> Ok, so you think the write speed is limited by the client and protocol, rather than the cassandra backend? This sounds reasonable, and fits with our use case, as we will have several servers writing. However, a bit harder to test!
>>>> 
>>>> Jeff
>>>> 
>>>> On Apr 3, 2012, at 1:27 PM, Jake Luciani wrote:
>>>> 
>>>>> Hi Jeff,
>>>>> 
>>>>> Writing serially over one connection will be slower. If you run many threads hitting the server at once you will see throughput improve.
>>>>> 
>>>>> Jake
>>>>> 
>>>>> 
>>>>> 
>>>>> On Apr 3, 2012, at 7:08 AM, Jeff Williams<je...@wherethebitsroam.com>  wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am looking at cassandra for a logging application. We currently log to a Postgresql database.
>>>>>> 
>>>>>> I set up 2 Cassandra servers for testing. I did a benchmark where I had 100 hashes representing log entries, read from a JSON file. I then looped over these to do 10,000 log inserts. I repeated the same test, writing to a PostgreSQL instance on one of the Cassandra servers. The script is attached. The Cassandra writes appear to perform a lot worse. Is this expected?
>>>>>> 
>>>>>> jeff@transcoder01:~$ ruby cassandra-bm.rb
>>>>>> cassandra
>>>>>> 3.170000   0.480000   3.650000 ( 12.032212)
>>>>>> jeff@transcoder01:~$ ruby cassandra-bm.rb
>>>>>> postgres
>>>>>> 2.140000   0.330000   2.470000 (  7.002601)
>>>>>> 
>>>>>> Regards,
>>>>>> Jeff
>>>>>> 
>>>>>> <cassandra-bm.rb>
>>> 
>> 
>