Posted to user@spark.apache.org by umargeek <um...@gmail.com> on 2017/10/26 16:22:00 UTC

Suggestions on using scala/python for Spark Streaming

We are building a Spark Streaming application that is processing- and
time-intensive. We currently use the Python API, but would welcome
suggestions on whether to switch to Scala, including the pros and cons, as
our next step is to set this up in production.

Thanks,
Umar 



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Suggestions on using scala/python for Spark Streaming

Posted by Sebastian Piu <se...@gmail.com>.
Have a look at how PySpark works in conjunction with Spark, as it is not
just a matter of language preference. There are several implications, and a
performance price to pay, if you go with Python.
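
To make the "performance price" a little more concrete: in PySpark, Python
functions passed to RDD or DStream operations run in separate Python worker
processes, so records are serialized on the way from the JVM to Python and
back. The sketch below is not Spark code. It only mimics that round trip
with the standard pickle module to show where the extra work comes from.
The names process, run_direct and run_with_roundtrip and the sample batch
are made up for illustration, and the real overhead depends on your data
and on which API you use.

```python
import pickle
import time


def process(record):
    # Stand-in for per-record business logic (hypothetical).
    return record * 2


def run_direct(batch):
    # Apply the function with no serialization boundary.
    return [process(r) for r in batch]


def run_with_roundtrip(batch):
    # Mimic crossing a process boundary: serialize the input batch,
    # deserialize it, process it, then serialize and deserialize the result.
    received = pickle.loads(pickle.dumps(batch))
    result = [process(r) for r in received]
    return pickle.loads(pickle.dumps(result))


def timed(fn, batch):
    # Return (result, elapsed seconds) for a single call.
    start = time.perf_counter()
    result = fn(batch)
    return result, time.perf_counter() - start


batch = list(range(100_000))
direct_result, direct_secs = timed(run_direct, batch)
roundtrip_result, roundtrip_secs = timed(run_with_roundtrip, batch)

print(f"direct:    {direct_secs:.4f}s")
print(f"roundtrip: {roundtrip_secs:.4f}s")
```

Both paths produce the same result; only the time spent differs, and the
numbers from this toy are not a prediction for any real job.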

At the end of the day, only you can answer whether that price is worth
paying to avoid retraining your team in another language. But if performance
is a key decision factor, then there isn't much debate: go for Scala.


Re: Suggestions on using scala/python for Spark Streaming

Posted by "lucas.gary@gmail.com" <lu...@gmail.com>.
I don't have any specific wisdom for you on that front.  But I've always
been served well by the 'Try both' approach.

Set up your benchmarks, configure both setups...  You don't have to go the
whole hog, but just enough to get a mostly realistic implementation
functional.  Run them both with some captured / fixture data...  And
compare.
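
A minimal sketch of what such a comparison harness might look like, using
only the Python standard library. Everything here is an assumption for
illustration: benchmark, compare, job_a and job_b are hypothetical names,
and the two placeholder jobs stand in for whatever launches each real
implementation against the captured data.

```python
import statistics
import time


def benchmark(run_job, fixture, repeats=5):
    # Run the job several times on the same fixture data and
    # return the median wall-clock time in seconds.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_job(fixture)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare(candidates, fixture, repeats=5):
    # candidates maps a label to a callable that runs one implementation.
    # Returns (label, median_seconds) pairs, fastest first.
    results = {
        name: benchmark(job, fixture, repeats)
        for name, job in candidates.items()
    }
    return sorted(results.items(), key=lambda item: item[1])


# Placeholder jobs (hypothetical). In a real comparison each callable
# would start one implementation on the captured data and block until
# it finishes.
def job_a(records):
    return sum(len(r) for r in records)


def job_b(records):
    return len("".join(records))


fixture = ["event-%d" % i for i in range(10_000)]
ranking = compare({"job_a": job_a, "job_b": job_b}, fixture)
for name, seconds in ranking:
    print(f"{name}: {seconds:.6f}s")
```

Using the median over a few repeats keeps one slow run from skewing the
result. Replaying captured data like this measures throughput on a fixed
input; it says less about latency under a live stream.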

I personally haven't come across a situation where you just have to go
Scala, but I've come across multiple situations where it was preferable, but
not by a big enough margin to retool a team and a product.

On the plus side, you'll be well set up for integration tests with whichever
system you end up rolling out!

Good luck!  And I'd love to hear about any findings you come across!

Gary Lucas
