Posted to user@flink.apache.org by Maciek Próchniak <mp...@touk.pl> on 2016/03/08 15:48:32 UTC
rebalance of streaming job after taskManager restart
Hi,
we have a streaming job with parallelism 2 and two task managers. The job
occupies one slot on each task manager. When I stop manager2, the job
is restarted and runs on manager1, occupying two of its slots.
How can I trigger a restart (or a similar process) that will cause the
job to be balanced across the task managers?
thanks,
maciek
Re: rebalance of streaming job after taskManager restart
Posted by Aljoscha Krettek <al...@apache.org>.
Yes, there are plans to make this more streamlined but we are not there yet, unfortunately.
> On 08 Mar 2016, at 16:07, Maciek Próchniak <mp...@touk.pl> wrote:
>
> Hi,
>
> thanks for the quick answer - yes, it does what I want to accomplish,
> but I was hoping for some "easier" solution.
> Are there any plans for a "restart" button/command or something similar? I mean, the whole restart process is already there, as I understand it, since it's triggered when a task manager dies.
>
> thanks,
> maciek
>
Re: rebalance of streaming job after taskManager restart
Posted by Maciek Próchniak <mp...@touk.pl>.
Hi,
thanks for the quick answer - yes, it does what I want to accomplish,
but I was hoping for some "easier" solution.
Are there any plans for a "restart" button/command or something similar? I mean,
the whole restart process is already there, as I understand it, since it's
triggered when a task manager dies.
thanks,
maciek
On 08/03/2016 16:03, Aljoscha Krettek wrote:
> Hi,
> I think what you can do is make a savepoint of your program, then cancel it and restart it from the savepoint. This should make Flink redistribute it on all TaskManagers.
>
> See https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/savepoints.html
> and
> https://ci.apache.org/projects/flink/flink-docs-master/apis/cli.html#savepoints
> for documentation about savepoints.
>
> The steps to follow should be:
>
> bin/flink savepoint <your job id>
>
> This will print a savepoint path that you will need later.
>
> bin/flink cancel <your job id>
>
> bin/flink run -s <savepoint path> …
>
> The last command is your usual run command but with the additional “-s” parameter to continue from a savepoint.
>
> I hope that helps.
>
> Cheers,
> Aljoscha
Re: rebalance of streaming job after taskManager restart
Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
I think what you can do is make a savepoint of your program, then cancel it and restart it from the savepoint. This should make Flink redistribute it on all TaskManagers.
See https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/savepoints.html
and
https://ci.apache.org/projects/flink/flink-docs-master/apis/cli.html#savepoints
for documentation about savepoints.
The steps to follow should be:
bin/flink savepoint <your job id>
This will print a savepoint path that you will need later.
bin/flink cancel <your job id>
bin/flink run -s <savepoint path> …
The last command is your usual run command but with the additional “-s” parameter to continue from a savepoint.
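For anyone who wants to script the three steps above, here is a minimal sketch. It only wraps the commands already listed; the FLINK and DRY_RUN variables and the function names are assumptions made up for illustration, not part of Flink. It does not try to parse the savepoint path out of the command output, since that format is not shown here, so the path has to be copied by hand between the two steps. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the savepoint -> cancel -> run sequence described above.
# FLINK, DRY_RUN and the function names are illustrative, not part of Flink.
FLINK="${FLINK:-bin/flink}"   # path to the Flink CLI script
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = job id. The CLI prints the savepoint path; note it down for the next step.
take_savepoint() {
    step "$FLINK" savepoint "$1"
}

# $1 = job id, $2 = savepoint path, remaining arguments = your usual run arguments.
# The job is only resubmitted if the cancel command succeeded.
restart_from_savepoint() {
    job_id="$1"
    savepoint_path="$2"
    shift 2
    step "$FLINK" cancel "$job_id" &&
        step "$FLINK" run -s "$savepoint_path" "$@"
}
```

Usage would be to source the file, call take_savepoint with the job id, copy the printed savepoint path, and then call restart_from_savepoint with the job id, that path and the usual run arguments, with DRY_RUN=0 once the printed commands look right.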
I hope that helps.
Cheers,
Aljoscha
> On 08 Mar 2016, at 15:48, Maciek Próchniak <mp...@touk.pl> wrote:
>
> Hi,
>
> we have a streaming job with parallelism 2 and two task managers. The job occupies one slot on each task manager. When I stop manager2, the job is restarted and runs on manager1, occupying two of its slots.
> How can I trigger a restart (or a similar process) that will cause the job to be balanced across the task managers?
>
> thanks,
> maciek