Posted to user@spark.apache.org by "Afshartous, Nick" <na...@turbine.com> on 2016/01/16 01:25:22 UTC

Consuming commands from a queue

Hi,


We have a streaming job that consumes from Kafka and outputs to S3.  We're going to have the job also send commands (to copy from S3 to Redshift) into a different Kafka topic.
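Roughly what we have in mind for the producer side; just a sketch, where the topic name, broker address, target table, bucket path, and IAM role are all placeholders:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object CopyCommandProducer {
      val props = new Properties()
      props.put("bootstrap.servers", "kafka-broker:9092")  // placeholder broker
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)

      // Called after a batch has landed in S3: publish the COPY statement
      // for that batch to the command topic.
      def sendCopyCommand(s3Path: String): Unit = {
        val copySql =
          s"""COPY events FROM '$s3Path'
             |CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/redshift-copy'
             |FORMAT AS JSON 'auto'""".stripMargin
        producer.send(new ProducerRecord[String, String]("copy-commands", copySql))
      }
    }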


What would be the best framework for consuming and processing the copy commands?  We're considering creating a second streaming job or using Akka.


Thanks for any suggestions,

--

    Nick

Re: Consuming commands from a queue

Posted by "Afshartous, Nick" <na...@turbine.com>.
Thanks Cody.


One reason I was thinking of using Akka is that some of the copies take much longer than others (or get stuck).  We've seen this with our current streaming job.  This can cause the entire streaming micro-batch to take longer.


If we had a set of Akka actors, then each copy would be isolated.
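
Roughly what I'm picturing; just a sketch, with the actor setup, pool size, and JDBC URL all made up:

    import java.sql.DriverManager
    import akka.actor.{Actor, ActorSystem, Props}
    import akka.routing.RoundRobinPool

    // Each COPY runs inside its own actor, so a slow or stuck copy only
    // ties up one routee instead of delaying a whole micro-batch.
    class CopyActor(jdbcUrl: String) extends Actor {
      def receive = {
        case copySql: String =>
          val conn = DriverManager.getConnection(jdbcUrl)
          try conn.createStatement().execute(copySql)
          finally conn.close()
      }
    }

    object CopyService extends App {
      val system = ActorSystem("copy-service")
      val jdbcUrl = "jdbc:redshift://example:5439/db?user=u&password=p"  // placeholder
      val router = system.actorOf(
        RoundRobinPool(8).props(Props(new CopyActor(jdbcUrl))), "copy-router")

      // Whatever consumes the Kafka command topic would forward each
      // command here; blocking JDBC calls should get a dedicated dispatcher.
      router ! "COPY events FROM 's3://bucket/path/' ..."
    }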

--

    Nick




Re: Consuming commands from a queue

Posted by Cody Koeninger <co...@koeninger.org>.
Reading commands from Kafka and triggering a Redshift copy is sufficiently simple that it could just be a bash script.  But if you've already got a Spark Streaming job set up, you may as well use it for consistency's sake.  There's definitely no need to mess around with Akka.
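
If you go the streaming route, the consumer side is only a few lines.  A rough sketch using the Spark 1.x direct Kafka API, with the brokers, topic, and JDBC URL as placeholders:

    import java.sql.DriverManager
    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object CopyCommandConsumer {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(
          new SparkConf().setAppName("redshift-copy"), Seconds(60))
        val kafkaParams = Map("metadata.broker.list" -> "kafka-broker:9092")
        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, Set("copy-commands"))

        stream.map(_._2).foreachRDD { rdd =>
          // Copy commands are small and infrequent, so collecting them to
          // the driver and running them serially over JDBC is reasonable.
          rdd.collect().foreach { copySql =>
            val conn = DriverManager.getConnection("jdbc:redshift://example:5439/db")
            try conn.createStatement().execute(copySql)
            finally conn.close()
          }
        }
        ssc.start()
        ssc.awaitTermination()
      }
    }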
