Posted to user@cassandra.apache.org by Cassa L <lc...@gmail.com> on 2016/05/10 04:08:40 UTC

Accessing Cassandra data from Spark Shell

Hi,
Has anyone tried accessing Cassandra data from the Spark shell? How do you do
it? Can you use HiveContext for Cassandra data? I'm using the community
version of Cassandra 3.0.

Thanks,
LCassa

Re: Accessing Cassandra data from Spark Shell

Posted by Ted Yu <yu...@gmail.com>.
> Can you use HiveContext for Cassandra data?

Most likely the above cannot be done.

On Mon, May 9, 2016 at 9:08 PM, Cassa L <lc...@gmail.com> wrote:

> Hi,
> Has anyone tried accessing Cassandra data using SparkShell? How do you do
> it? Can you use HiveContext for Cassandra data? I'm using community version
> of Cassandra-3.0
>
> Thanks,
> LCassa
>

Re: Accessing Cassandra data from Spark Shell

Posted by Cassa L <lc...@gmail.com>.
I tried every combination of Spark Cassandra Connector versions, and none of
them worked. Finally, I downgraded Spark to 1.5.1, and now it works.
LCassa

On Wed, May 18, 2016 at 11:11 AM, Mohammed Guller <mo...@glassbeam.com>
wrote:

> As Ben mentioned, Spark 1.5.2 does work with C*.  Make sure that you are
> using the correct version of the Spark Cassandra Connector.
>
>
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
>
>
> *From:* Ben Slater [mailto:ben.slater@instaclustr.com]
> *Sent:* Tuesday, May 17, 2016 11:00 PM
> *To:* user@cassandra.apache.org; Mohammed Guller
> *Cc:* user
>
> *Subject:* Re: Accessing Cassandra data from Spark Shell
>
>
>
> It definitely should be possible for 1.5.2 (I have used it with
> spark-shell and cassandra connector with 1.4.x). The main trick is in
> lining up all the versions and building an appropriate connector jar.
>
>
>
> Cheers
>
> Ben
>
>
>
> On Wed, 18 May 2016 at 15:40 Cassa L <lc...@gmail.com> wrote:
>
> Hi,
>
> I followed instructions to run SparkShell with Spark-1.6. It works fine.
> However, I need to use spark-1.5.2 version. With it, it does not work. I
> keep getting NoSuchMethod Errors. Is there any issue running Spark Shell
> for Cassandra using older version of Spark?
>
>
>
>
>
> Regards,
>
> LCassa
>
>
>
> On Tue, May 10, 2016 at 6:48 PM, Mohammed Guller <mo...@glassbeam.com>
> wrote:
>
> Yes, it is very simple to access Cassandra data using Spark shell.
>
>
>
> Step 1: Launch the spark-shell with the spark-cassandra-connector package
>
> $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
>
>
>
> Step 2: Create a DataFrame pointing to your Cassandra table
>
> val dfCassTable = sqlContext.read
>   .format("org.apache.spark.sql.cassandra")
>   .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
>   .load()
>
>
>
> From this point onward, you have complete access to the DataFrame API. You
> can even register it as a temporary table, if you would prefer to use
> SQL/HiveQL.
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
>
>
> *From:* Ben Slater [mailto:ben.slater@instaclustr.com]
> *Sent:* Monday, May 9, 2016 9:28 PM
> *To:* user@cassandra.apache.org; user
> *Subject:* Re: Accessing Cassandra data from Spark Shell
>
>
>
> You can use SparkShell to access Cassandra via the Spark Cassandra
> connector. The getting started article on our support page will probably
> give you a good steer to get started even if you’re not using Instaclustr:
> https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-
>
>
>
> Cheers
>
> Ben
>
>
>
> On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:
>
> Hi,
>
> Has anyone tried accessing Cassandra data using SparkShell? How do you do
> it? Can you use HiveContext for Cassandra data? I'm using community version
> of Cassandra-3.0
>
>
>
> Thanks,
>
> LCassa
>
> --
>
> ————————
>
> Ben Slater
>
> Chief Product Officer, Instaclustr
>
> +61 437 929 798
>
>
>
> --
>
> ————————
>
> Ben Slater
>
> Chief Product Officer, Instaclustr
>
> +61 437 929 798
>

RE: Accessing Cassandra data from Spark Shell

Posted by Mohammed Guller <mo...@glassbeam.com>.
As Ben mentioned, Spark 1.5.2 does work with C*.  Make sure that you are using the correct version of the Spark Cassandra Connector.
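
One way to sanity-check the version line-up before launching the shell is sketched below. The ConnectorCoordinates object and its methods are hypothetical helpers written for this example; they are not part of Spark or the connector. The sketch assumes that 1.x connector releases are meant for the Spark release with the same major.minor number (connector 1.5.x for Spark 1.5.x), and that the artifact suffix must match the Scala binary version your Spark build uses. Treat both as assumptions and confirm them against the connector's compatibility table.

```scala
// Hypothetical helpers for illustration only; not part of Spark or the connector.
object ConnectorCoordinates {
  // Reduce a full version to its major.minor line, e.g. "1.5.2" -> "1.5".
  def majorMinor(version: String): String =
    version.split('.').take(2).mkString(".")

  // Build the Maven coordinate to pass to `spark-shell --packages`.
  def coordinate(scalaVersion: String, connectorVersion: String): String =
    s"com.datastax.spark:spark-cassandra-connector_${majorMinor(scalaVersion)}:$connectorVersion"

  // Assumption: a 1.x connector targets the Spark release with the same major.minor.
  def sameReleaseLine(sparkVersion: String, connectorVersion: String): Boolean =
    majorMinor(sparkVersion) == majorMinor(connectorVersion)
}

// Coordinate for a Scala 2.10 build of Spark and connector 1.5.0.
println(ConnectorCoordinates.coordinate("2.10.5", "1.5.0"))
// Spark 1.5.2 with connector 1.5.0 is on the same release line.
println(ConnectorCoordinates.sameReleaseLine("1.5.2", "1.5.0"))
// Spark 1.5.2 with connector 1.6.0 is not, which is one plausible source of NoSuchMethodError.
println(ConnectorCoordinates.sameReleaseLine("1.5.2", "1.6.0"))
```

Inside a running spark-shell, sc.version and scala.util.Properties.versionNumberString report the Spark and Scala versions to feed into a check like this.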


Mohammed
Author: Big Data Analytics with Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Ben Slater [mailto:ben.slater@instaclustr.com]
Sent: Tuesday, May 17, 2016 11:00 PM
To: user@cassandra.apache.org; Mohammed Guller
Cc: user
Subject: Re: Accessing Cassandra data from Spark Shell

It definitely should be possible for 1.5.2 (I have used it with spark-shell and cassandra connector with 1.4.x). The main trick is in lining up all the versions and building an appropriate connector jar.

Cheers
Ben

On Wed, 18 May 2016 at 15:40 Cassa L <lc...@gmail.com> wrote:
Hi,
I followed instructions to run SparkShell with Spark-1.6. It works fine. However, I need to use spark-1.5.2 version. With it, it does not work. I keep getting NoSuchMethod Errors. Is there any issue running Spark Shell for Cassandra using older version of Spark?


Regards,
LCassa

On Tue, May 10, 2016 at 6:48 PM, Mohammed Guller <mo...@glassbeam.com> wrote:
Yes, it is very simple to access Cassandra data using Spark shell.

Step 1: Launch the spark-shell with the spark-cassandra-connector package
$SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0

Step 2: Create a DataFrame pointing to your Cassandra table
val dfCassTable = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
  .load()

From this point onward, you have complete access to the DataFrame API. You can even register it as a temporary table, if you would prefer to use SQL/HiveQL.

Mohammed
Author: Big Data Analytics with Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Ben Slater [mailto:ben.slater@instaclustr.com]
Sent: Monday, May 9, 2016 9:28 PM
To: user@cassandra.apache.org; user
Subject: Re: Accessing Cassandra data from Spark Shell

You can use SparkShell to access Cassandra via the Spark Cassandra connector. The getting started article on our support page will probably give you a good steer to get started even if you’re not using Instaclustr: https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-

Cheers
Ben

On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:
Hi,
Has anyone tried accessing Cassandra data using SparkShell? How do you do it? Can you use HiveContext for Cassandra data? I'm using community version of Cassandra-3.0

Thanks,
LCassa
--
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798

--
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798

Re: Accessing Cassandra data from Spark Shell

Posted by Ben Slater <be...@instaclustr.com>.
It definitely should be possible for 1.5.2 (I have used it with spark-shell
and cassandra connector with 1.4.x). The main trick is in lining up all the
versions and building an appropriate connector jar.

Cheers
Ben

On Wed, 18 May 2016 at 15:40 Cassa L <lc...@gmail.com> wrote:

> Hi,
> I followed instructions to run SparkShell with Spark-1.6. It works fine.
> However, I need to use spark-1.5.2 version. With it, it does not work. I
> keep getting NoSuchMethod Errors. Is there any issue running Spark Shell
> for Cassandra using older version of Spark?
>
>
> Regards,
> LCassa
>
> On Tue, May 10, 2016 at 6:48 PM, Mohammed Guller <mo...@glassbeam.com>
> wrote:
>
>> Yes, it is very simple to access Cassandra data using Spark shell.
>>
>>
>>
>> Step 1: Launch the spark-shell with the spark-cassandra-connector package
>>
>> $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
>>
>>
>>
>> Step 2: Create a DataFrame pointing to your Cassandra table
>>
>> val dfCassTable = sqlContext.read
>>   .format("org.apache.spark.sql.cassandra")
>>   .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
>>   .load()
>>
>>
>>
>> From this point onward, you have complete access to the DataFrame API.
>> You can even register it as a temporary table, if you would prefer to use
>> SQL/HiveQL.
>>
>>
>>
>> Mohammed
>>
>> Author: Big Data Analytics with Spark
>> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>>
>>
>>
>> *From:* Ben Slater [mailto:ben.slater@instaclustr.com]
>> *Sent:* Monday, May 9, 2016 9:28 PM
>> *To:* user@cassandra.apache.org; user
>> *Subject:* Re: Accessing Cassandra data from Spark Shell
>>
>>
>>
>> You can use SparkShell to access Cassandra via the Spark Cassandra
>> connector. The getting started article on our support page will probably
>> give you a good steer to get started even if you’re not using Instaclustr:
>> https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-
>>
>>
>>
>> Cheers
>>
>> Ben
>>
>>
>>
>> On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:
>>
>> Hi,
>>
>> Has anyone tried accessing Cassandra data using SparkShell? How do you do
>> it? Can you use HiveContext for Cassandra data? I'm using community version
>> of Cassandra-3.0
>>
>>
>>
>> Thanks,
>>
>> LCassa
>>
>> --
>>
>> ————————
>>
>> Ben Slater
>>
>> Chief Product Officer, Instaclustr
>>
>> +61 437 929 798
>>
>
> --
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798

Re: Accessing Cassandra data from Spark Shell

Posted by Cassa L <lc...@gmail.com>.
Hi,
I followed the instructions to run the Spark shell with Spark 1.6, and it
works fine. However, I need to use Spark 1.5.2, and with that version it does
not work: I keep getting NoSuchMethodError exceptions. Is there a known issue
with running the Spark shell against Cassandra on an older version of Spark?


Regards,
LCassa

On Tue, May 10, 2016 at 6:48 PM, Mohammed Guller <mo...@glassbeam.com>
wrote:

> Yes, it is very simple to access Cassandra data using Spark shell.
>
>
>
> Step 1: Launch the spark-shell with the spark-cassandra-connector package
>
> $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
>
>
>
> Step 2: Create a DataFrame pointing to your Cassandra table
>
> val dfCassTable = sqlContext.read
>   .format("org.apache.spark.sql.cassandra")
>   .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
>   .load()
>
>
>
> From this point onward, you have complete access to the DataFrame API. You
> can even register it as a temporary table, if you would prefer to use
> SQL/HiveQL.
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
>
>
> *From:* Ben Slater [mailto:ben.slater@instaclustr.com]
> *Sent:* Monday, May 9, 2016 9:28 PM
> *To:* user@cassandra.apache.org; user
> *Subject:* Re: Accessing Cassandra data from Spark Shell
>
>
>
> You can use SparkShell to access Cassandra via the Spark Cassandra
> connector. The getting started article on our support page will probably
> give you a good steer to get started even if you’re not using Instaclustr:
> https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-
>
>
>
> Cheers
>
> Ben
>
>
>
> On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:
>
> Hi,
>
> Has anyone tried accessing Cassandra data using SparkShell? How do you do
> it? Can you use HiveContext for Cassandra data? I'm using community version
> of Cassandra-3.0
>
>
>
> Thanks,
>
> LCassa
>
> --
>
> ————————
>
> Ben Slater
>
> Chief Product Officer, Instaclustr
>
> +61 437 929 798
>

RE: Accessing Cassandra data from Spark Shell

Posted by Mohammed Guller <mo...@glassbeam.com>.
Yes, it is very simple to access Cassandra data using Spark shell.

Step 1: Launch the spark-shell with the spark-cassandra-connector package
$SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0

Step 2: Create a DataFrame pointing to your Cassandra table
val dfCassTable = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
  .load()

From this point onward, you have complete access to the DataFrame API. You can even register it as a temporary table, if you would prefer to use SQL/HiveQL.
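
As a rough sketch of the temporary-table route, something like the following could be run inside a spark-shell session on Spark 1.5 or 1.6, where sqlContext is already defined by the shell. The temporary table name cass_table and the row_count alias are placeholders chosen for this example, and the keyspace and table names are the same placeholders used in Step 2.

```scala
// Run inside spark-shell (Spark 1.5/1.6) started with the connector package;
// `sqlContext` is provided by the shell.
val dfCassTable = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "your_column_family", "keyspace" -> "your_keyspace"))
  .load()

// Inspect the columns the connector mapped from the Cassandra table.
dfCassTable.printSchema()

// Register the DataFrame under a placeholder name so it can be queried with SQL.
dfCassTable.registerTempTable("cass_table")

// Query the temporary table; row_count is just an illustrative alias.
sqlContext.sql("SELECT COUNT(*) AS row_count FROM cass_table").show()
```

The temporary table exists only for the lifetime of this sqlContext; nothing is written back to Cassandra.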

Mohammed
Author: Big Data Analytics with Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Ben Slater [mailto:ben.slater@instaclustr.com]
Sent: Monday, May 9, 2016 9:28 PM
To: user@cassandra.apache.org; user
Subject: Re: Accessing Cassandra data from Spark Shell

You can use SparkShell to access Cassandra via the Spark Cassandra connector. The getting started article on our support page will probably give you a good steer to get started even if you’re not using Instaclustr: https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-

Cheers
Ben

On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:
Hi,
Has anyone tried accessing Cassandra data using SparkShell? How do you do it? Can you use HiveContext for Cassandra data? I'm using community version of Cassandra-3.0

Thanks,
LCassa
--
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798

Re: Accessing Cassandra data from Spark Shell

Posted by Ben Slater <be...@instaclustr.com>.
You can use SparkShell to access Cassandra via the Spark Cassandra
connector. The getting started article on our support page will probably
give you a good steer to get started even if you’re not using Instaclustr:
https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-
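
For a quick first test from the shell, a minimal sketch using the connector's RDD API might look like the following. It assumes spark-shell was started with the connector on the classpath and with spark.cassandra.connection.host pointing at one of your Cassandra nodes (for example via --conf); the keyspace and table names are placeholders.

```scala
// Run inside spark-shell with the Spark Cassandra Connector on the classpath;
// `sc` is provided by the shell.
import com.datastax.spark.connector._

// Placeholder keyspace and table names; replace with your own.
val rows = sc.cassandraTable("your_keyspace", "your_column_family")

// Print a handful of rows to confirm the connection works.
rows.take(5).foreach(println)

// Count the rows; this scans the whole table, so use with care on large tables.
println(rows.count())
```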

Cheers
Ben

On Tue, 10 May 2016 at 14:08 Cassa L <lc...@gmail.com> wrote:

> Hi,
> Has anyone tried accessing Cassandra data using SparkShell? How do you do
> it? Can you use HiveContext for Cassandra data? I'm using community version
> of Cassandra-3.0
>
> Thanks,
> LCassa
>
-- 
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798
