Posted to user@spark.apache.org by Ashish Soni <as...@gmail.com> on 2016/02/19 20:48:31 UTC
Communication between two spark streaming Job
Hi ,
Is there any way two different Spark Streaming jobs can communicate with
each other? The scenario is as follows:
We have two Spark Streaming jobs: one processes metadata and the other
processes the actual data (which needs that metadata).
When someone updates the metadata, we need to refresh the cache
maintained in the second job so that it can make use of the new metadata.
Please help
Ashish
Re: Communication between two spark streaming Job
Posted by Chris Fregly <ch...@fregly.com>.
If you need update notifications, you could introduce ZooKeeper (eek!) or a Kafka topic between the jobs.
I've seen internal Kafka topics (as opposed to the external topics feeding the Spark Streaming jobs) used for this type of incremental update use case.
Think of the updates as transaction logs.
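The transaction-log idea can be sketched roughly as follows. A plain Python list stands in for the Kafka topic, and the event shape and offset handling are illustrative assumptions; in the real jobs, the metadata job would produce to Kafka and the data job would consume and apply the events once per micro-batch.

```python
# Sketch of the "transaction log" pattern: the metadata job publishes update
# events, and the data job replays unseen events into its local cache.
# A plain Python list stands in for the Kafka topic (hypothetical name
# "metadata-updates"); offsets are tracked the way a Kafka consumer would.

metadata_update_log = []  # stand-in for a Kafka topic

def publish_update(key, value):
    """Metadata job side: append one update event to the log."""
    metadata_update_log.append({"key": key, "value": value})

def apply_updates(cache, log, offset):
    """Data job side: replay entries not yet seen into the local cache.
    Returns the new offset so the next micro-batch resumes from there."""
    for event in log[offset:]:
        cache[event["key"]] = event["value"]
    return len(log)

# Metadata job publishes two updates; the later one supersedes the earlier.
publish_update("currency", "USD")
publish_update("currency", "EUR")

# Data job catches up before processing its next micro-batch.
cache, offset = {}, 0
offset = apply_updates(cache, metadata_update_log, offset)
print(cache["currency"], offset)  # EUR 2
```

Because the log is replayed in order, the cache always converges to the latest value even if the data job falls several updates behind.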
> On Feb 19, 2016, at 10:35 PM, Ted Yu <yu...@gmail.com> wrote:
>
> Have you considered using a key-value store that is accessible to both jobs?
>
> The communication would take place through this store.
>
> Cheers
>
>> On Fri, Feb 19, 2016 at 11:48 AM, Ashish Soni <as...@gmail.com> wrote:
>> Hi ,
>>
>> Is there any way two different Spark Streaming jobs can communicate with each other? The scenario is as follows:
>>
>> We have two Spark Streaming jobs: one processes metadata and the other processes the actual data (which needs that metadata).
>>
>> When someone updates the metadata, we need to refresh the cache maintained in the second job so that it can make use of the new metadata.
>>
>> Please help
>>
>> Ashish
>
Re: Communication between two spark streaming Job
Posted by Ted Yu <yu...@gmail.com>.
Have you considered using a key-value store that is accessible to both
jobs?
The communication would take place through this store.
Cheers
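A rough sketch of the shared key-value store approach, with an in-memory dict standing in for the external store (e.g. Redis or HBase). The key names and the version-counter scheme are illustrative assumptions, not anything from the thread; the point is that the data job only reloads when the version it sees in the store has moved.

```python
# Sketch of the shared key-value store pattern: the metadata job writes new
# metadata and bumps a version counter; the data job checks the counter once
# per micro-batch and reloads only when it has changed. A dict stands in for
# the external store; all key names here are hypothetical.

kv_store = {"metadata:version": 0, "metadata:payload": {}}

def update_metadata(payload):
    """Metadata job side: write new metadata, then bump the version counter."""
    kv_store["metadata:payload"] = payload
    kv_store["metadata:version"] += 1

def refresh_if_stale(local):
    """Data job side: call once per micro-batch; reload only on version change."""
    remote_version = kv_store["metadata:version"]
    if remote_version != local.get("version"):
        local["payload"] = dict(kv_store["metadata:payload"])
        local["version"] = remote_version
    return local

local_cache = {}
refresh_if_stale(local_cache)    # initial load (version 0)
update_metadata({"rate": 1.1})   # metadata job publishes an update
refresh_if_stale(local_cache)    # data job picks it up on its next batch
print(local_cache["payload"])  # {'rate': 1.1}
```

The version check keeps the per-batch cost to a single store read in the common case where the metadata has not changed.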
On Fri, Feb 19, 2016 at 11:48 AM, Ashish Soni <as...@gmail.com> wrote:
> Hi ,
>
> Is there any way two different Spark Streaming jobs can communicate with
> each other? The scenario is as follows:
>
> We have two Spark Streaming jobs: one processes metadata and the other
> processes the actual data (which needs that metadata).
>
> When someone updates the metadata, we need to refresh the cache
> maintained in the second job so that it can make use of the new metadata.
>
> Please help
>
> Ashish
>