Posted to user@spark.apache.org by Russell Spitzer <ru...@gmail.com> on 2017/07/21 13:33:40 UTC

Re: Spark (SQL / Structured Streaming) Cassandra - PreparedStatement

The SCC includes the Java driver, which means you can just call Java
driver functions directly. It also provides a serializable wrapper
(CassandraConnector) with session and prepared-statement pooling.
Something like:

val cc = CassandraConnector(sc.getConf)
SomeFunctionWithAnIterator { it: SomeIterator =>
  cc.withSessionDo { session =>
    val ps = session.prepare("statement")
    it.map(row => session.executeAsync(ps.bind(row)))
    // Do something with the futures here
  }
}
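
The comment above glosses over what "do something with the futures" means; here is a slightly fuller sketch of the same pattern. Hedged: the keyspace/table, column values, and the RDD are hypothetical placeholders; `executeAsync` and `getUninterruptibly` are Java driver 3.x calls.

```scala
import com.datastax.spark.connector.cql.CassandraConnector

val cc = CassandraConnector(sc.getConf)

// Hypothetical RDD[(String, String)] of rows to write
rdd.mapPartitions { it =>
  cc.withSessionDo { session =>
    // Prepared once per partition; the connector also caches prepared
    // statements per session, so repeated prepares are cheap
    val ps = session.prepare("INSERT INTO ks.tbl (id, value) VALUES (?, ?)")
    // Fire the writes asynchronously, then block on each future before the
    // partition ends so nothing is left in flight when the session is released
    it.map { case (id, value) => session.executeAsync(ps.bind(id, value)) }
      .toVector                      // force all executeAsync calls to fire
      .map(_.getUninterruptibly)     // wait for each ResultSetFuture
      .iterator
  }
}.count() // mapPartitions is lazy; force evaluation
```

The `.toVector` is the important bit: without it the iterator stays lazy and the futures would be created one at a time as downstream consumes them, losing the concurrency.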

I wrote a blog post about this here
http://www.russellspitzer.com/2017/02/27/Concurrency-In-Spark/#concurrency-with-the-cassandra-java-driver

On Tue, Apr 11, 2017 at 4:05 AM Bastien DINE <ba...@coservit.com>
wrote:

> Hi everyone,
>
> I'm using Spark Structured Streaming for machine learning purposes in real
> time, and I want to store predictions in my Cassandra cluster.
>
> Since I am in a streaming context, executing the same request many times
> per second, one mandatory optimization is to use a PreparedStatement.
>
> In the Cassandra Spark connector (
> https://github.com/datastax/spark-cassandra-connector) there is no way to
> use a PreparedStatement (in Scala or Python; I'm not considering Java an
> option)
>
> Should I use a Scala (https://github.com/outworkers/phantom) / Python (
> https://github.com/datastax/python-driver) Cassandra driver?
>
> How does it work then? Does my connection object need to be serializable
> to be passed to the workers?
>
> Can anyone help me?
>
> Thanks :)
>
> Bastien
>
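
For the Structured Streaming side of the question: CassandraConnector is itself serializable, which answers the worker-serialization concern, and one way to use it from a streaming query is Spark's ForeachWriter API (available since Spark 2.0). A hedged sketch; the keyspace, table, and column names here are hypothetical:

```scala
import com.datastax.driver.core.{PreparedStatement, Session}
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.sql.{ForeachWriter, Row}

// Hypothetical sink: writes each streaming Row with a prepared statement.
// The connector pools sessions per executor, so openSession()/close() are
// cheap reference-count operations, not real connect/disconnect cycles.
class CassandraSinkWriter(cc: CassandraConnector) extends ForeachWriter[Row] {
  @transient private var session: Session = _
  @transient private var ps: PreparedStatement = _

  override def open(partitionId: Long, version: Long): Boolean = {
    session = cc.openSession()
    ps = session.prepare("INSERT INTO ks.predictions (id, score) VALUES (?, ?)")
    true
  }

  override def process(row: Row): Unit =
    session.execute(ps.bind(row.getAs[String]("id"),
                            Double.box(row.getAs[Double]("score"))))

  override def close(errorOrNull: Throwable): Unit =
    if (session != null) session.close() // releases the pooled reference
}

// Usage, assuming a hypothetical streaming DataFrame `predictions`:
// predictions.writeStream
//   .foreach(new CassandraSinkWriter(CassandraConnector(spark.sparkContext.getConf)))
//   .start()
```

Only the CassandraConnector (small and serializable) is shipped in the closure; the non-serializable Session and PreparedStatement are marked @transient and created on the executor in open().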