Posted to user@flink.apache.org by Evgeny Kincharov <Ev...@epam.com> on 2017/02/27 13:04:21 UTC
Running streaming job on every node of cluster
Hi,
I have a very simple streaming job, and I want to distribute it across every node of my Flink cluster.
The job is simple:
source (SocketTextStream) -> map -> sink (AsyncHttpSink).
When I increase the parallelism of my job at deployment time or directly in code, it has no effect because the source can't work in parallel.
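For reference, a minimal sketch of such a job and of setting the parallelism in code (DiscardingSink stands in for the AsyncHttpSink mentioned above, and host/port are placeholders; the parallelism can also be given at submission time with "bin/flink run -p 4 socket-job.jar"):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;

public class SocketJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // default for every operator that does not override it

        env.socketTextStream("localhost", 12345)       // the socket source always runs with parallelism 1
            .map(new MapFunction<String, String>() {
                @Override
                public String map(final String s) throws Exception { return s; }
            })                                          // map and sink pick up the parallelism of 4
            .addSink(new DiscardingSink<String>());     // placeholder for AsyncHttpSink

        env.execute("socket -> map -> sink");
    }
}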
Now I reduce "Task Slots" to 1 on every node and deploy my job as many times as there are nodes in the cluster.
It works while I have only one job, but if I want to deploy another one in parallel there are no free slots.
I hope a more convenient way to do this exists. Thanks.
BR,
Evgeny
Re: Running streaming job on every node of cluster
Posted by Nico Kruber <ni...@data-artisans.com>.
Hi Evgeny,
I tried to reproduce your example with the following code, with another
console listening via "nc -l 12345":
env.setParallelism(2);
env.addSource(new SocketTextStreamFunction("localhost", 12345, " ", 3))
    .map(new MapFunction<String, String>() {
        @Override
        public String map(final String s) throws Exception { return s; }
    })
    .addSink(new DiscardingSink<String>());
This way, I do get a source with parallelism 1 and map & sink with parallelism
2, and the whole program occupies 2 slots as expected. You can check in the
web interface of your cluster how many slots are taken after executing one
instance of your program.
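If the goal is to spread the map and sink over all nodes while keeping the single socket source, one option (just a sketch building on the snippet above; the 4 stands in for the total number of slots in your cluster) is to set the parallelism per operator rather than for the whole job:

env.addSource(new SocketTextStreamFunction("localhost", 12345, " ", 3)) // stays at parallelism 1
    .map(new MapFunction<String, String>() {
        @Override
        public String map(final String s) throws Exception { return s; }
    })
    .setParallelism(4)                         // run 4 parallel map subtasks
    .addSink(new DiscardingSink<String>())
    .setParallelism(4);                        // run 4 parallel sink subtasks

With the default slot sharing, the job then needs 4 slots in total (its highest operator parallelism), so with one task slot per TaskManager and a parallelism equal to the number of nodes, each node should end up running one map and one sink subtask.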
How do you set your parallelism?
Nico
On Monday, 27 February 2017 14:04:21 CET Evgeny Kincharov wrote:
> Hi,
>
> I have a very simple streaming job, and I want to distribute it across every
> node of my Flink cluster.
>
> The job is simple:
>
> source (SocketTextStream) -> map -> sink (AsyncHttpSink).
>
> When I increase the parallelism of my job at deployment time or directly in
> code, it has no effect because the source can't work in parallel. Now I
> reduce "Task Slots" to 1 on every node and deploy my job as many times as
> there are nodes in the cluster. It works while I have only one job, but if I
> want to deploy another one in parallel there are no free slots. I hope a
> more convenient way to do this exists.
> Thanks.
>
> BR,
> Evgeny