Posted to users@kafka.apache.org by Eliza <el...@ChinaBuckets.com> on 2019/08/21 06:36:08 UTC

Questions for platform to choose

Hello,

We have Spark, Flink, Storm, and Kafka all installed.
For real-time stream processing, which of these is the best choice?
Like other big players, we have a huge volume of logs in our stack.

Thanks.

Re: Questions for platform to choose

Posted by Liam Clarke <li...@adscale.co.nz>.
Hi Eliza,

Kafka Streams, Spark Streaming, Flink and Storm are all good. They also
all have their caveats, so it's really hard to say that any one of them is
the best.

For example, Kafka Streams can't read from one Kafka cluster and write to
another, but Spark can.
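
As a rough illustration of the Spark side of that point, here is a minimal
PySpark Structured Streaming sketch that reads from one cluster and writes to
another. The broker addresses, topic names, checkpoint path, and the helper
function names are all made-up placeholders, and it assumes the
spark-sql-kafka connector is available to the Spark session that you pass in.
Treat it as a starting point, not a tested pipeline.

```python
def source_options(bootstrap_servers, topic):
    """Options for Spark's Kafka source (the cluster being read from)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def sink_options(bootstrap_servers, topic, checkpoint_dir):
    """Options for Spark's Kafka sink (the cluster being written to)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "topic": topic,
        # Structured Streaming needs a checkpoint location for the Kafka sink.
        "checkpointLocation": checkpoint_dir,
    }


def start_bridge(spark, src, dst):
    """Start a streaming query that copies records from src to dst.

    `spark` is an existing SparkSession; src and dst are option dicts.
    """
    records = spark.readStream.format("kafka").options(**src).load()
    return (
        records.select("key", "value")  # columns the Kafka sink understands
        .writeStream.format("kafka")
        .options(**dst)
        .start()
    )


# Hypothetical clusters and topics, for illustration only.
SRC = source_options("cluster-a:9092", "raw-logs")
DST = sink_options("cluster-b:9092", "raw-logs-copy", "/tmp/bridge-checkpoint")
```

Calling start_bridge(spark, SRC, DST) would start the copy. The point is only
that the source and the sink each take their own kafka.bootstrap.servers
setting, so they can point at different clusters.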

But then Spark offers two flavours of streaming: the low-level, RDD-based
streaming (DStreams), which is fiddly to integrate with Kafka, and the
higher-level, DataFrame-based Structured Streaming, which integrates with
Kafka a lot more easily but currently doesn't support chaining one
aggregation (a group by) after another in a single streaming query.

Spark requires either manually creating and managing a cluster to scale, or
else using YARN or EMR, whereas a Kafka Streams app is straightforward to
scale by deploying another copy of the app.
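
To make the "deploy another copy" point concrete: instances of a Kafka
Streams app that share the same application.id join one consumer group, and
the input partitions are divided among them. The sketch below is a
deliberately simplified round-robin model of that idea in plain Python. The
function name and instance names are hypothetical, and Kafka's real
assignment logic is more involved than this.

```python
def assign_tasks(num_partitions, instances):
    """Spread one task per input partition across app instances.

    Simplified round-robin model only, not Kafka's actual assignor.
    """
    if not instances:
        raise ValueError("need at least one running instance")
    assignment = {name: [] for name in instances}
    for partition in range(num_partitions):
        owner = instances[partition % len(instances)]
        assignment[owner].append(partition)
    return assignment


# One copy of the app owns all six partitions.
one = assign_tasks(6, ["app-1"])
# Deploying a second copy splits the work between the two.
two = assign_tasks(6, ["app-1", "app-2"])

print(one)  # {'app-1': [0, 1, 2, 3, 4, 5]}
print(two)  # {'app-1': [0, 2, 4], 'app-2': [1, 3, 5]}
```

One consequence of this model is that parallelism is capped by the number of
input partitions: a copy deployed beyond that count ends up with nothing to
process.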

I can keep going...

You really need to analyse what you're trying to achieve, what existing
expertise you have in your organisation, and then just try the various
technologies.

On Wed, 21 Aug. 2019, 6:42 pm Eliza, <el...@chinabuckets.com> wrote:

> Hello,
>
> We have Spark, Flink, Storm, and Kafka all installed.
> For real-time stream processing, which of these is the best choice?
> Like other big players, we have a huge volume of logs in our stack.
>
> Thanks.
>