Posted to user@cassandra.apache.org by Ajay <aj...@gmail.com> on 2015/01/29 07:50:27 UTC

Performance difference between Regular Statement Vs PreparedStatement

Hi All,

I tried both insert and select queries (built with QueryBuilder) as regular
statements and as PreparedStatements, in multithreaded code that runs each
query roughly 10k to 50k times. But I don't see any visible improvement from
using PreparedStatement. What could be the reason?

Note : I am using the same Session object in multiple threads.

Cassandra version : 2.0.11
Driver version : 2.1.4

Thanks
Ajay

Re: Performance difference between Regular Statement Vs PreparedStatement

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
+1 to everything Eric said.

The penalty of not using token aware routing increases as you add nodes,
load, and network overhead.  This is kind of like batch statements.  People
use them in dev, with 1 node, and think they're great to help with
performance.  But when you put them in production... not so much.
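
A rough way to see why the penalty grows with cluster size: a minimal sketch,
assuming keys are spread evenly over the ring and that a non-token-aware
client picks its coordinator uniformly among all nodes. The class and method
names are made up for illustration and are not part of any driver. It only
counts whether the coordinator happens to own the data; it ignores
consistency level, latency, and load.

```java
// Hypothetical back-of-the-envelope model, not driver code.
class RoutingHopModel {

    // Fraction of requests whose coordinator is NOT a replica for the key,
    // i.e. requests that pay an extra coordinator-to-replica network hop.
    // Assumes a uniformly chosen coordinator and evenly distributed keys.
    static double extraHopFraction(int nodes, int replicationFactor) {
        if (nodes <= 0 || replicationFactor <= 0 || replicationFactor > nodes) {
            throw new IllegalArgumentException("need 0 < replicationFactor <= nodes");
        }
        return 1.0 - (double) replicationFactor / nodes;
    }

    public static void main(String[] args) {
        int replicationFactor = 3; // assumed value for illustration
        for (int nodes : new int[] {3, 6, 12, 48}) {
            System.out.printf("nodes=%d rf=%d extra-hop fraction=%.2f%n",
                    nodes, replicationFactor, extraHopFraction(nodes, replicationFactor));
        }
    }
}
```

Under these assumptions a cluster where every node holds a replica never pays
the extra hop, which is why a 1 or 2 node dev setup hides the cost.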

Re: Performance difference between Regular Statement Vs PreparedStatement

Posted by Eric Stevens <mi...@gmail.com>.
That's not a particularly good setup for load testing, and I would try hard
not to draw any conclusions from it.  Most likely your biggest bottleneck is
I/O in your VMs, and any savings from using prepared statements are dwarfed
by the cost of virtualization.  Point 2's effects are also minimized because
everything is local.  This setup is fine for operational practice, but not
much use beyond that.

If you want to do local testing, I would recommend a tool called CCM:
https://github.com/pcmanus/ccm

This tool can easily set up and run multiple instances in your host OS
without having the overhead of virtualization.  Your performance tests will
be truer, but still hard to draw much meaningful insight from.

Re: Performance difference between Regular Statement Vs PreparedStatement

Posted by Ajay <aj...@gmail.com>.
Thanks Eric. I didn't know the point about token aware routing.

But with points 2 and 3 I didn't notice much improvement with prepared
statements.  I have 2 Cassandra nodes running in virtual boxes on the same
machine, and the test client runs on that machine too.

Thanks
Ajay

Re: Performance difference between Regular Statement Vs PreparedStatement

Posted by Eric Stevens <mi...@gmail.com>.
Prepared statements can take advantage of token aware routing, which IIRC
non-prepared statements cannot in the DataStax Java Driver, so as your
cluster grows you reduce the overhead of statement coordination (assuming you
use token aware routing).  There should also be less data to transfer for
shipping the query (the CQL portion is shipped once during the prepare
stage, and only the data is shipped on subsequent executions).  You'll also
save the cluster the overhead of repeatedly parsing your CQL statements.
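
To put a rough number on the "less data to transfer" point, here is a
simplified sketch. It is not the real native protocol: it ignores frame
headers and per-value length prefixes, and the 16-byte statement id is an
assumption. The class, method names, and the example table are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical payload model, not driver code.
class PayloadModel {

    // Assumed size of the id the client sends in place of the CQL text.
    static final int PREPARED_ID_BYTES = 16;

    // Regular statement: the full CQL text (values inlined) goes out every time.
    static long regularBytes(String cqlWithValues, int executions) {
        return (long) cqlWithValues.getBytes(StandardCharsets.UTF_8).length * executions;
    }

    // Prepared statement: the CQL text goes out once at prepare time,
    // then each execution sends only the id plus the bound values.
    static long preparedBytes(String cqlWithMarkers, int valueBytesPerExecution, int executions) {
        long prepareOnce = cqlWithMarkers.getBytes(StandardCharsets.UTF_8).length;
        return prepareOnce + (long) (PREPARED_ID_BYTES + valueBytesPerExecution) * executions;
    }

    public static void main(String[] args) {
        String regular = "INSERT INTO ks.users (id, name) VALUES (42, 'ajay')";
        String prepared = "INSERT INTO ks.users (id, name) VALUES (?, ?)";
        int valueBytes = 4 + 4; // assumed: a 4-byte int id and a 4-byte name
        int executions = 50_000;
        System.out.println("regular  bytes: " + regularBytes(regular, executions));
        System.out.println("prepared bytes: " + preparedBytes(prepared, valueBytes, executions));
    }
}
```

For a short statement like this the saving per request is a few dozen bytes,
so it matters more for long statements and real network links than for a
client talking to nodes on the same machine.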
