Posted to user@flink.apache.org by Robin Cassan <ro...@contentsquare.com> on 2022/06/29 07:52:19 UTC

Upgrading a job in rolling mode

Hi all! We are running a Flink cluster on Kubernetes and deploying a single
job on it through "flink run <jar>". Whenever we want to modify the jar, we
cancel the job and run the "flink run" command again, with the new jar and
the retained checkpoint URL from the first run.
This works well, but it adds some unavoidable downtime for each update:
- Downtime in-between the "cancel" and the "run"
- Downtime during restoration of the state
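
For reference, the flow we use today looks roughly like this (the jar name,
checkpoint path and job id below are just placeholders):

  # cancel the running job; its retained checkpoint stays in object storage
  flink cancel <job-id>

  # resubmit the new jar, restoring from the retained checkpoint
  flink run -d \
    -s s3://bucket/checkpoints/<job-id>/chk-1234 \
    new-version.jar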

We aim to reduce this downtime to a minimum and are wondering if there is
a way to submit a new version of a job without completely stopping the old
one, having each TM run the new jar one at a time and waiting for the
state to be restored before moving on to the next TM. This would limit the
downtime to a single TM at a time.

Otherwise, we could try to submit a second job and let it restore before
canceling the old one, but this raises complex synchronisation issues,
especially since the second job would be restoring the first job's retained
(incremental) checkpoint, which seems like a bad idea...
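
To make that idea concrete, the sequence we are considering looks roughly
like this (paths and job ids are placeholders):

  # start the new jar from the old job's retained checkpoint,
  # while the old job is still running
  flink run -d \
    -s s3://bucket/checkpoints/<old-job-id>/chk-1234 \
    new-version.jar

  # once the new job has restored its state, cancel the old one
  flink cancel <old-job-id>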

What would be the best way to reduce downtime in this scenario, in your
opinion?

Thanks!

Re: Upgrading a job in rolling mode

Posted by Robin Cassan <ro...@contentsquare.com>.
Hi Weihua! Thanks so much for your answer; it makes it pretty apparent that
this is absolutely not trivial to do.
Have you (or anyone else) been able to apply something close to the
Green/Blue deployment idea from this Zero Downtime Upgrade presentation:
https://www.youtube.com/watch?v=Hyt3YrtKQAM ? And if so, how did you manage
to automate and orchestrate it (including rolling back if the newer job
doesn't work properly, for example)?
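
In case it helps to make the question concrete, the orchestration we would
like to automate looks roughly like this (job ids, paths and the health
check are only placeholders):

  # start the "green" job from a savepoint of the running "blue" job
  flink savepoint <blue-job-id> s3://bucket/savepoints/
  flink run -d -s s3://bucket/savepoints/savepoint-xxxx new-version.jar

  # wait until the green job reports RUNNING via the JobManager REST API,
  # plus whatever business-level health check applies
  curl -s http://<jobmanager>:8081/jobs/<green-job-id>

  # healthy   -> cancel the blue job
  flink cancel <blue-job-id>
  # unhealthy -> cancel the green job instead and keep blue (rollback)
  # flink cancel <green-job-id>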



Re: Upgrading a job in rolling mode

Posted by Weihua Hu <hu...@gmail.com>.
Hi, Robin

This is a good question, but Flink can't do rolling upgrades.

I'll try to explain what it would cost for Flink to support rolling upgrades.
1. There is a shuffle connection between tasks in a region, and in order to
ensure the consistency of the data processed by the upstream and downstream
tasks in the same region, these tasks need to fail over/restart at the same
time. So the cost of a rolling upgrade for jobs with a keyBy/hash shuffle is
not much different from the cost of stopping and restarting the job.
2. To keep a rolling upgrade correct, many details need to be considered,
such as the checkpoint coordinator, failover handling, and the operator
coordinator.

But, IMO, job rolling upgrades are a worthwhile thing to do.

Best,
Weihua

