Posted to user@flink.apache.org by eric hoffmann <si...@gmail.com> on 2018/11/23 09:33:26 UTC

Call batch job in streaming context?

Hi
Is it possible to call a batch job in a streaming context?
What I want to do is:
for a given input event, fetch Cassandra elements based on the event data,
apply a transformation to them, and apply a ranking once all elements
fetched from Cassandra are processed.
If I do this in batch mode I would have to submit a job for each event,
and I can have an event every 45 seconds.
Is there any alternative? Can I start a batch job that will receive an
external request, process it, and wait for the next request?
thx
Eric
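
[Editor's note] The per-event flow Eric describes (fetch the matching rows, transform each one, then rank once all of them are processed) can be sketched in plain Java, independent of any framework. The `Element` record, the fetch, the transformation, and the scoring below are all hypothetical placeholders, not anything from the thread:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RankPerEvent {
    // Hypothetical record standing in for a Cassandra row.
    static class Element {
        final String id;
        final double score;
        Element(String id, double score) { this.id = id; this.score = score; }
    }

    // Placeholder for the real Cassandra fetch, keyed by event data.
    static List<Element> fetchForEvent(String eventKey) {
        List<Element> rows = new ArrayList<>();
        rows.add(new Element(eventKey + "-a", 0.2));
        rows.add(new Element(eventKey + "-b", 0.9));
        rows.add(new Element(eventKey + "-c", 0.5));
        return rows;
    }

    // Per event: transform every fetched element, then rank only once
    // all of them have been processed.
    static List<Element> process(String eventKey) {
        List<Element> transformed = new ArrayList<>();
        for (Element e : fetchForEvent(eventKey)) {
            transformed.add(new Element(e.id, e.score * 2)); // example transformation
        }
        transformed.sort(Comparator.comparingDouble((Element e) -> e.score).reversed());
        return transformed;
    }

    public static void main(String[] args) {
        for (Element e : process("evt1")) {
            System.out.println(e.id + " " + e.score);
        }
    }
}
```

The point of the sketch is only the shape of the computation; the replies below discuss where to run it.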

Re: Call batch job in streaming context?

Posted by bastien dine <ba...@gmail.com>.
Hi Eric,

You can run a job from another one using the REST API.
This is the only way we have found to launch a batch job from a streaming
job.
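
[Editor's note] A minimal sketch of this approach, using Flink's monitoring REST API endpoint `POST /jars/:jarid/run` (the jar must already be uploaded to the JobManager). The host, jar id, entry class, and program arguments below are hypothetical placeholders:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SubmitBatchJob {
    // Build the JSON body accepted by Flink's POST /jars/:jarid/run endpoint.
    static String runRequestBody(String entryClass, String programArgs, int parallelism) {
        return "{\"entryClass\":\"" + entryClass + "\","
             + "\"programArgs\":\"" + programArgs + "\","
             + "\"parallelism\":" + parallelism + "}";
    }

    // POST the run request to the JobManager's REST endpoint.
    static int submit(String restBase, String jarId, String body) throws IOException {
        URL url = new URL(restBase + "/jars/" + jarId + "/run");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        String body = runRequestBody("com.example.RankingBatchJob", "--date 2018-11-23", 2);
        // Actual submission, with placeholder host and jar id:
        // submit("http://jobmanager:8081", "<uploaded-jar-id>.jar", body);
        System.out.println(body);
    }
}
```

Calling this from inside a streaming operator (e.g. a `ProcessFunction`) fires off one batch job per incoming event.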

------------------

Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io


On Fri, Nov 23, 2018 at 11:52, Piotr Nowojski <pi...@data-artisans.com>
wrote:

> Hi,
>
> I’m not sure if I understand your problem and your context, but spawning a
> batch job every 45 seconds doesn’t sound like that bad an idea (as long as
> the job is short).
>
> Another idea would be to incorporate this batch job inside your streaming
> job, for example by reading from Cassandra using an AsyncIO operator:
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/asyncio.html
>
> A quick Google search revealed, for example, this:
>
>
> https://stackoverflow.com/questions/43067681/read-data-from-cassandra-for-processing-in-flink
>
> Piotrek
>
> > On 23 Nov 2018, at 10:33, eric hoffmann <si...@gmail.com>
> wrote:
> >
> > Hi
> > Is it possible to call a batch job in a streaming context?
> > What I want to do is:
> > for a given input event, fetch Cassandra elements based on the event data,
> apply a transformation to them, and apply a ranking once all elements
> fetched from Cassandra are processed.
> > If I do this in batch mode I would have to submit a job for each event,
> and I can have an event every 45 seconds.
> > Is there any alternative? Can I start a batch job that will receive an
> external request, process it, and wait for the next request?
> > thx
> > Eric
>
>

Re: Call batch job in streaming context?

Posted by Piotr Nowojski <pi...@data-artisans.com>.
Hi,

I’m not sure if I understand your problem and your context, but spawning a batch job every 45 seconds doesn’t sound like that bad an idea (as long as the job is short).

Another idea would be to incorporate this batch job inside your streaming job, for example by reading from Cassandra using an AsyncIO operator:
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/asyncio.html
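
[Editor's note] Flink's AsyncIO operator (`AsyncDataStream` wrapping a user `AsyncFunction`) hands each record to user code that completes a future with the lookup result. The core pattern can be sketched with plain `CompletableFuture`, so it runs without Flink or a database; the Cassandra lookup, transformation, and ranking below are hypothetical stand-ins:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class AsyncLookupSketch {
    // Stand-in for an asynchronous Cassandra query keyed by the event.
    static CompletableFuture<List<String>> fetchAsync(String eventKey) {
        return CompletableFuture.supplyAsync(
            () -> Arrays.asList(eventKey + "-row1", eventKey + "-row2"));
    }

    // Per event: fetch asynchronously, transform each row, then rank
    // once all rows for that event are available.
    static CompletableFuture<List<String>> handleEvent(String eventKey) {
        return fetchAsync(eventKey).thenApply(rows ->
            rows.stream()
                .map(String::toUpperCase)   // example transformation
                .sorted()                   // example ranking
                .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(handleEvent("evt").join());
    }
}
```

In an actual Flink job the same callback logic would live in an `AsyncFunction`, and the operator keeps many such lookups in flight at once instead of blocking the stream on each query.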

A quick Google search revealed, for example, this:

https://stackoverflow.com/questions/43067681/read-data-from-cassandra-for-processing-in-flink

Piotrek 

> On 23 Nov 2018, at 10:33, eric hoffmann <si...@gmail.com> wrote:
> 
> Hi
> Is it possible to call a batch job in a streaming context?
> What I want to do is:
> for a given input event, fetch Cassandra elements based on the event data, apply a transformation to them, and apply a ranking once all elements fetched from Cassandra are processed.
> If I do this in batch mode I would have to submit a job for each event, and I can have an event every 45 seconds.
> Is there any alternative? Can I start a batch job that will receive an external request, process it, and wait for the next request?
> thx
> Eric