
Posted to user@cassandra.apache.org by Rohit Rai <ro...@tuplejump.com> on 2014/10/03 20:16:37 UTC

[ANN] SparkSQL support for Cassandra with Calliope

Hi All,

A year ago we started this journey and laid the path for the Spark +
Cassandra stack. We established the groundwork and direction for Spark
Cassandra connectors, and we have been happy seeing the results.

With the release of Spark 1.1.0 and SparkSQL, we feel it's time to take
Calliope <http://tuplejump.github.io/calliope/> to the logical next level,
also paving the way for much more advanced functionality to come.

Yesterday we released Calliope 1.1.0 Community Tech Preview
<https://twitter.com/tuplejump/status/517739186124627968>, which brings
native SparkSQL support for Cassandra. Further details are available
here <http://tuplejump.github.io/calliope/tech-preview.html>.

This release showcases support for core spark-sql
<http://tuplejump.github.io/calliope/start-with-sql.html>, hiveql
<http://tuplejump.github.io/calliope/start-with-hive.html> and
HiveThriftServer <http://tuplejump.github.io/calliope/calliope-server.html>.

I call it "native" spark-sql integration because it doesn't rely on
Cassandra's Hive connectors (like Cash or DSE), which saves a level of
indirection through Hive.

It also allows us to harness Spark's analyzer and optimizer in the future to
work out the best execution plan, targeting a balance between Cassandra's
querying restrictions and Spark's in-memory processing.

As far as we know, this is the first and only third-party data store
connector for SparkSQL. This is a CTP release because it relies on Spark
internals that do not yet have a stable developer API. We will work with
the Spark Community to document the requirements and work towards a
standard and stable API for third-party data store integration.

On another note, we no longer require you to sign up to access the early
access code repository.

We invite all of you to try it and give us your valuable feedback.

Regards,

Rohit
Founder & CEO, Tuplejump, Inc.
____________________________
www.tuplejump.com
The Data Engineering Platform

Re: [ANN] SparkSQL support for Cassandra with Calliope

Posted by Brian O'Neill <bo...@alumni.brown.edu>.

Well done, Rohit (and crew).

-brian

---
Brian O'Neill
Chief Technology Officer


Health Market Science
The Science of Better Results
2700 Horizon Drive  King of Prussia, PA  19406
M: 215.588.6024  @boneill42 <http://www.twitter.com/boneill42>   
healthmarketscience.com



Re: [ANN] SparkSQL support for Cassandra with Calliope

Posted by Peter Lin <wo...@gmail.com>.

It's nice to see Spark + Cassandra work.

This gives users an alternative to CQL that has more SQL functionality.
