
Posted to user@spark.apache.org by Kevin Burton <bu...@spinn3r.com> on 2016/02/15 04:51:57 UTC

Best way to bring up Spark with Cassandra (and Elasticsearch) in production.

Afternoon.

About 6 months ago I tried (and failed) to get Spark and Cassandra working
together in production due to dependency hell.

I'm going to give it another try!

Here's my general strategy.

I'm going to create a Maven module for my code, with the Spark dependencies.

Then I'm going to get that to run and have unit tests for reading from
files and writing the data back out the way I want via Spark jobs.

Then I'm going to set up CassandraUnit to embed Cassandra in my project.
Then I'm going to point Spark at Cassandra and have the same code as above
work with Cassandra, except that instead of reading from a file it reads
from and writes to C*.
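
One way to keep the job logic identical across the file-backed and the
C*-backed tests is to hide the storage behind a small interface. What follows
is only a minimal sketch in plain Java: RecordStore, InMemoryStore, and
NormalizeJob are hypothetical names made up for illustration, and nothing
here calls the real Spark or Cassandra APIs. A file-backed or C*-backed
implementation would slot in behind the same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical storage abstraction: the job only ever sees this interface.
interface RecordStore {
    List<String> readAll();
    void writeAll(List<String> records);
}

// In-memory stand-in so the sketch runs without files or a cluster.
// A file-backed or Cassandra-backed store would implement the same interface.
final class InMemoryStore implements RecordStore {
    private final List<String> records = new ArrayList<>();

    InMemoryStore(List<String> initial) {
        records.addAll(initial);
    }

    @Override
    public List<String> readAll() {
        return new ArrayList<>(records);
    }

    @Override
    public void writeAll(List<String> out) {
        records.clear();
        records.addAll(out);
    }
}

// The logic under test: trims records, drops empty ones, lowercases the rest.
// It never names a concrete backend, so tests can swap the store freely.
final class NormalizeJob {
    static void run(RecordStore source, RecordStore sink) {
        List<String> out = new ArrayList<>();
        for (String record : source.readAll()) {
            String trimmed = record.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed.toLowerCase(Locale.ROOT));
            }
        }
        sink.writeAll(out);
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        RecordStore source = new InMemoryStore(Arrays.asList("  Foo ", "", "BAR"));
        RecordStore sink = new InMemoryStore(new ArrayList<String>());
        NormalizeJob.run(source, sink);
        System.out.println(sink.readAll());
    }
}
```

With this shape, the unit tests written against the file-backed store can be
rerun unchanged against an embedded Cassandra, assuming a store implementation
exists for each backend.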

Then, once testing is working, I'm going to set up Spark in cluster mode with
the same dependencies.

Does this sound like a reasonable strategy?

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: Best way to bring up Spark with Cassandra (and Elasticsearch) in production.

Posted by Ted Yu <yu...@gmail.com>.

Sounds reasonable.

If you have any questions about the Spark C* connector, please consider
posting them to the connector's own mailing list.
