Posted to user@cassandra.apache.org by Ajay <aj...@gmail.com> on 2015/05/15 19:02:17 UTC

Re: Caching the PreparedStatement (Java driver)

Hi Joseph,

The Java driver currently caches prepared statements, but only through weak
references, i.e. the cache holds a statement only as long as the client code
still references it. That in turn means we need to cache them ourselves.
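
A minimal sketch of such an application-side cache, assuming a fixed-size LRU
keyed by the query string. StatementCache and its preparer function are
hypothetical names: the type parameter S stands in for the driver's
PreparedStatement, and the preparer stands in for a call such as
session.prepare(query). Nothing here is the driver's own API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical fixed-size LRU cache keyed by the CQL string.
 * S stands in for the driver's PreparedStatement type (assumption).
 */
class StatementCache<S> {
    private final Map<String, S> lru;
    private final Function<String, S> preparer;
    private int prepareCalls = 0;

    StatementCache(final int maxEntries, Function<String, S> preparer) {
        this.preparer = preparer;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.lru = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                return size() > maxEntries; // evict once the cache is over capacity
            }
        };
    }

    /** Returns the cached statement, calling the preparer only on a cache miss. */
    synchronized S get(String query) {
        S statement = lru.get(query);
        if (statement == null) {
            statement = preparer.apply(query); // stands in for the round trip to the server
            prepareCalls++;
            lru.put(query, statement);
        }
        return statement;
    }

    synchronized int prepareCalls() { return prepareCalls; }

    synchronized int size() { return lru.size(); }
}

class StatementCacheDemo {
    public static void main(String[] args) {
        // The lambda is a placeholder for session.prepare(query).
        StatementCache<String> cache =
                new StatementCache<>(2, q -> "prepared(" + q + ")");
        cache.get("SELECT * FROM Test WHERE Pk = ?");
        cache.get("SELECT * FROM Test WHERE Pk = ?"); // served from the cache
        System.out.println(cache.prepareCalls()); // prints 1
    }
}
```

The sketch holds its lock while preparing, which keeps it short but serializes
cache misses; an application under load might prefer a concurrent map instead.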

But I am also not sure what happens when a cached prepared statement is
executed after Cassandra nodes restart. Is the server's prepared statement
cache persisted, or is it held in memory? If it is in memory, how do we handle
stale prepared statements in the cache?

Thanks
Ajay


On Fri, May 15, 2015 at 6:28 PM, ja <ja...@gmail.com> wrote:

> Hi,
>
> Wouldn't it be a good feature for the Java driver to maintain a cache of
> PreparedStatements (PS)? Is there any reason why it's left to the
> application to do this? I am currently implementing a cache of PS that is
> loaded at app startup, but how do I ensure this cache is always good to use?
> Say there's a restart on the Cassandra server side: this cache would be
> stale, and I assume the next use of a PS from the cache would fail. Is there
> any way to recover from this?
>
> Thanks,
> Joseph
>
> On Sunday, March 1, 2015 at 12:46:14 AM UTC+5:30, Vishy Kasar wrote:
>>
>>
>> On Feb 28, 2015, at 4:25 AM, Ajay <aj...@gmail.com> wrote:
>>
>> Hi,
>>
>> My earlier question was whether it is safe to cache PreparedStatements
>> (using the Java driver) on the client side, and Olivier confirmed that it
>> is.
>>
>> Now the question is: do we really need to cache the PreparedStatement on
>> the client side?
>>
>> Let's take a scenario as below:
>>
>> 1) Client fires a REST query "SELECT * from Test where Pk = val1";
>> 2) REST service prepares a statement "SELECT * from Test where Pk = ?"
>> 3) Executes the PreparedStatement by setting the values.
>> 4) Assume we don't cache the PreparedStatement
>> 5) Client fires another REST query "SELECT * from Test where Pk = val2";
>> 6) REST service prepares a statement "SELECT * from Test where Pk = ?"
>> 7) Executes the PreparedStatement by setting the values.
>>
>>
>> You should avoid re-preparing the statement (step 6 above). Creating a
>> prepared statement involves a round trip to the server, so you
>> should create it once and reuse it. You can bind it with different values
>> and execute the bound statement each time.
>>
>> In this case, is there any benefit of using the PreparedStatement?
>>
>>     From the Java driver code, Session.prepare(query) doesn't check
>> whether the same query was prepared earlier. It calls the server
>> directly, passing the query. The server returns a PreparedId. Does
>> the server maintain a cache of prepared queries, or does it perform
>> all the steps to prepare a query again if the client prepares the same
>> query more than once (using the same Session and Cluster instance, which I
>> think doesn't matter)?
>>
>> Thanks
>> Ajay
>>
>>
>> On Sat, Feb 28, 2015 at 9:17 AM, Ajay <aj...@gmail.com> wrote:
>>
>>> Thanks Olivier.
>>>
>>> Most of the REST query calls would come from other applications
>>> writing to or reading from Cassandra, which means most queries from an
>>> application would be the same (same column families but different values).
>>>
>>> Thanks
>>> Ajay
>>> On 28-Feb-2015 6:05 am, "Olivier Michallat" <ol...@datastax.com>
>>> wrote:
>>>
>>>> Hi Ajay,
>>>>
>>>> Yes, it is safe to hold a reference to PreparedStatement instances in
>>>> your client code. If you always run the same pre-defined statements, you
>>>> can store them as fields in your resource classes.
>>>>
>>>> If your statements are dynamically generated (for example, inserting
>>>> different subsets of the columns depending on what was provided in the REST
>>>> payload), your caching approach is valid. When you evict a
>>>> PreparedStatement from your cache, the driver will also remove the
>>>> corresponding id from its internal cache. If you re-prepare it later it
>>>> might still be in the Cassandra-side cache, but that is not a problem.
>>>>
>>>> One caveat: you should be reasonably confident that your prepared
>>>> statements will be reused. If your query strings are always different,
>>>> preparing will bring no advantage.
>>>>
>>>> --
>>>> Olivier Michallat
>>>> Driver & tools engineer, DataStax
>>>>
>>>> On Fri, Feb 27, 2015 at 7:04 PM, Ajay <aj...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are building REST APIs for Cassandra using the Cassandra Java
>>>>> Driver.
>>>>>
>>>>> So, as per the guidelines below from the documentation, we are caching
>>>>> the Cluster instance (per cluster) and the Session instance (per keyspace),
>>>>> as they are thread-safe.
>>>>>
>>>>> http://www.datastax.com/documentation/developer/java-driver/2.0/java-driver/fourSimpleRules.html
>>>>>
>>>>> As the Cluster and Session instance(s) are already cached in the
>>>>> application, and as PreparedStatements provide better performance, we
>>>>> thought to build the PreparedStatement for each REST query implicitly (as
>>>>> REST calls are stateless) and cache the PreparedStatement. Whenever a REST
>>>>> query is invoked, we look for a PreparedStatement in the cache, and create
>>>>> it and put it in the cache if it doesn't exist. (The cache is an
>>>>> in-memory, fixed-size LRU cache.)
>>>>>
>>>>> Is it a safe approach to cache PreparedStatements on the client side?
>>>>> Looking at the Java driver code, the Cluster class stores
>>>>> PreparedStatements as weak references (to rebuild them when a node is down
>>>>> or a new node is added).
>>>>>
>>>>> Thanks
>>>>> Ajay
>>>>>
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to java-driver-us...@lists.datastax.com.
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>

Re: Caching the PreparedStatement (Java driver)

Posted by Tyler Hobbs <ty...@datastax.com>.
On Fri, May 15, 2015 at 12:02 PM, Ajay <aj...@gmail.com> wrote:

>
> But I am also not sure what happens when a cached prepared statement is
> executed after Cassandra nodes restart. Is the server's prepared statement
> cache persisted, or is it held in memory?


For now, it's just in memory, so they are lost when the node is restarted.


> If it is in memory, how do we handle stale prepared statements in the cache?


If a prepared statement ID is used that Cassandra doesn't recognize (e.g.
after a node restart), it responds with a specific error to the driver.
When the driver sees this error, it automatically re-prepares the statement
against that node using the statement info from its own cache.  After the
statement has been re-prepared, it attempts to execute the query again.
This all happens transparently, so your application will not even be aware
of it (aside from an increase in latency).
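
The flow described above can be sketched with a toy model. This is not the
driver's actual code: ToyNode, ToyDriver and ToyUnpreparedException are
hypothetical stand-ins, and the sketch assumes that re-preparing the same
query string yields the same id.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical error a node raises for an id it does not recognize. */
class ToyUnpreparedException extends RuntimeException {
    ToyUnpreparedException(String id) {
        super("unknown prepared statement id " + id);
    }
}

/** Toy stand-in for a Cassandra node whose prepared statements live in memory only. */
class ToyNode {
    private final Map<String, String> preparedById = new HashMap<>();

    String prepare(String query) {
        // Toy id derived from the query text, so re-preparing gives the same id (assumption).
        String id = Integer.toHexString(query.hashCode());
        preparedById.put(id, query);
        return id;
    }

    String execute(String id, Object value) {
        String query = preparedById.get(id);
        if (query == null) {
            throw new ToyUnpreparedException(id);
        }
        return query.replace("?", String.valueOf(value)); // pretend to run the bound query
    }

    /** Simulates a restart: everything prepared so far is lost. */
    void restart() {
        preparedById.clear();
    }
}

/** Toy stand-in for the driver, which remembers the query behind each id. */
class ToyDriver {
    private final ToyNode node;
    private final Map<String, String> queryById = new HashMap<>();
    private int rePrepares = 0;

    ToyDriver(ToyNode node) {
        this.node = node;
    }

    String prepare(String query) {
        String id = node.prepare(query);
        queryById.put(id, query);
        return id;
    }

    String execute(String id, Object value) {
        try {
            return node.execute(id, value);
        } catch (ToyUnpreparedException e) {
            // Re-prepare from the driver-side copy, then retry once.
            node.prepare(queryById.get(id));
            rePrepares++;
            return node.execute(id, value);
        }
    }

    int rePrepares() { return rePrepares; }
}

class RePrepareDemo {
    public static void main(String[] args) {
        ToyNode node = new ToyNode();
        ToyDriver driver = new ToyDriver(node);
        String id = driver.prepare("SELECT * FROM Test WHERE Pk = ?");
        node.restart(); // the node forgets the statement
        System.out.println(driver.execute(id, 42)); // prints SELECT * FROM Test WHERE Pk = 42
        System.out.println(driver.rePrepares());    // prints 1
    }
}
```

The caller never sees the exception; the only visible effect in this model is
the extra prepare call, which corresponds to the added latency mentioned above.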

There are plans to persist prepared statements in a system table:
https://issues.apache.org/jira/browse/CASSANDRA-8831


-- 
Tyler Hobbs
DataStax <http://datastax.com/>