Posted to dev@yunikorn.apache.org by Craig Condit <ap...@craigcondit.com> on 2022/01/19 16:44:25 UTC

[RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Hi all,

The vote to Release Apache YuniKorn (incubating) 0.12.2 RC2 has passed with 3 binding +1 votes and 3 non-binding +1 votes.

Vote thread: https://lists.apache.org/thread/1gw0k0g5fy86r8ljnjttdco04w7z5j4j

Thank you to all the members who helped verify this release. We will move to IPMC voting shortly.


Thanks,
Craig



[WITHDRAWN] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Craig Condit <ap...@craigcondit.com>.
All,

I was discussing this a bit more with Weiwei offline, and I think we are better off withdrawing 0.12.2 RC2 and releasing 0.12.2 RC3 instead. I will get a build together shortly and issue a new vote.

Thanks,

Craig


> On Jan 21, 2022, at 9:39 AM, Craig Condit <ap...@craigcondit.com> wrote:
> 
> Chaoran, nice catch on this one. Unfortunate that we didn’t find it before cutting 0.12.2.
> 
> I agree with Wilfred that we can add this to the release notes on the website, but we should backport the fix to 0.12.3 as well. I can RM that release too, unless someone else wants to volunteer.
> 
> - Craig


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@yunikorn.apache.org
For additional commands, e-mail: dev-help@yunikorn.apache.org


Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Weiwei Yang <ww...@apache.org>.
Hi all

I do not think we should get 0.12.2 out as we know we need to fix this bug.
Having 0.12.2 out with a known issue doesn't sound better than just
withdrawing it and re-releasing 0.12.2, using 0.12.2-RC3.
Can we just withdraw the IPMC vote and start 0.12.2-RC3 right away?

On Fri, Jan 21, 2022 at 7:39 AM Craig Condit <ap...@craigcondit.com> wrote:

> Chaoran, nice catch on this one. Unfortunate that we didn’t find it before
> cutting 0.12.2.
>
> I agree with Wilfred that we can add this to the release notes on the website,
> but we should backport the fix to 0.12.3 as well. I can RM that release too,
> unless someone else wants to volunteer.
>
> - Craig

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Craig Condit <ap...@craigcondit.com>.
Chaoran, nice catch on this one. Unfortunate that we didn’t find it before cutting 0.12.2.

I agree with Wilfred that we can add this to the release notes on the website, but we should backport the fix to 0.12.3 as well. I can RM that release too, unless someone else wants to volunteer.

- Craig



> On Jan 21, 2022, at 12:44 AM, Wilfred Spiegelenburg <wi...@apache.org> wrote:
> 
> We have seen large numbers of people running and deploying. I have
> opened a PR with the fix.
> The scheduler should not get deleted unless it is scaled down on purpose.
> It should not get evicted either; it should run as a high priority pod,
> unless we missed that.
> Crashing of the scheduler is a bug.
> 
> We should let v0.12.2 go through as normal. In the release
> announcement we should have a section that points to known issues, and
> we can reference the jira there with the workaround.
> 
> The workaround is as simple as a scale down and scale up. As long as
> the admission controller is running, all pods will be pushed towards
> the YuniKorn scheduler. We can start on a next release on the branch
> v0.12. We should get this case added to our e2e tests.
> 
> Wilfred


Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Wilfred Spiegelenburg <wi...@apache.org>.
We have seen large numbers of people running and deploying. I have
opened a PR with the fix.
The scheduler should not get deleted unless it is scaled down on purpose.
It should not get evicted either; it should run as a high priority pod,
unless we missed that.
Crashing of the scheduler is a bug.

We should let v0.12.2 go through as normal. In the release
announcement we should have a section that points to known issues, and
we can reference the jira there with the workaround.

The workaround is as simple as a scale down and scale up. As long as
the admission controller is running, all pods will be pushed towards
the YuniKorn scheduler. We can start on a next release on the branch
v0.12. We should get this case added to our e2e tests.

Wilfred

On Fri, 21 Jan 2022 at 17:15, Weiwei Yang <ww...@apache.org> wrote:
>
> Agree, this needs to be fixed.
> Likely we need to revoke 0.12.2 and get out a 0.12.3.


Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Weiwei Yang <ww...@apache.org>.
Agree, this needs to be fixed.
Likely we need to revoke 0.12.2 and get out a 0.12.3.

On Thu, Jan 20, 2022 at 9:56 PM Chaoran Yu <yu...@gmail.com> wrote:

> Yes, Helm install and upgrade both work.
> The failure scenario is as follows:
>
> 1. Both the admission controller and the scheduler pods are running
> 2. The scheduler pod is restarted for some reason (e.g. deleted, evicted,
> or crashed)
> 3. The new scheduler pod will be stuck in the pending state because it’s
> intercepted by the admission controller (The schedulerName field is
> yunikorn).
>
> I think this bug is critical because if the scheduler pod fails for any
> reason, someone has to manually redeploy the whole thing.

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Chaoran Yu <yu...@gmail.com>.
Yes, Helm install and upgrade both work.
The failure scenario is as follows:

1. Both the admission controller and the scheduler pods are running
2. The scheduler pod is restarted for some reason (e.g. deleted, evicted, or crashed)
3. The new scheduler pod will be stuck in the pending state because it’s intercepted by the admission controller (The schedulerName field is yunikorn).

I think this bug is critical because if the scheduler pod fails for any reason, someone has to manually redeploy the whole thing.
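
The three steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not YuniKorn's actual code: the `Pod` class, the `admit`, `schedule` and `restart_scheduler` functions, the `skip_scheduler_pod` guard and the `app` label are all hypothetical names chosen for illustration. The guard is only one conceivable way to avoid the stuck pod and is not necessarily what the fix for YUNIKORN-1038 does.

```python
from dataclasses import dataclass, field

YUNIKORN = "yunikorn"          # scheduler name written by the admission controller
DEFAULT = "default-scheduler"  # name of the built-in Kubernetes scheduler


@dataclass
class Pod:
    name: str
    labels: dict = field(default_factory=dict)
    scheduler_name: str = DEFAULT
    phase: str = "Pending"


def admit(pod: Pod, skip_scheduler_pod: bool) -> Pod:
    """Toy admission controller: route every pod to the yunikorn scheduler.

    With skip_scheduler_pod=True (a hypothetical guard), the scheduler's own
    pod keeps the default scheduler name.
    """
    is_scheduler = pod.labels.get("app") == YUNIKORN  # hypothetical label
    if not (skip_scheduler_pod and is_scheduler):
        pod.scheduler_name = YUNIKORN
    return pod


def schedule(pod: Pod, yunikorn_running: bool) -> Pod:
    """A pod only leaves Pending if the scheduler it names is running."""
    if pod.scheduler_name == DEFAULT or yunikorn_running:
        pod.phase = "Running"
    return pod


def restart_scheduler(skip_scheduler_pod: bool) -> Pod:
    # Step 2: the old scheduler pod is gone, so yunikorn is not running.
    replacement = Pod("yunikorn-scheduler", labels={"app": YUNIKORN})
    # Step 3: the admission controller intercepts the replacement pod.
    admit(replacement, skip_scheduler_pod)
    return schedule(replacement, yunikorn_running=False)


if __name__ == "__main__":
    print(restart_scheduler(skip_scheduler_pod=False).phase)  # Pending
    print(restart_scheduler(skip_scheduler_pod=True).phase)   # Running
```

Without the guard, the replacement pod names a scheduler that is not running and stays Pending; with the guard, it falls through to the default scheduler and starts.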


> On Jan 20, 2022, at 21:45, Weiwei Yang <ww...@apache.org> wrote:
> 
> Hmmm, that is a bug. But during the release verification, I have tried the
> helm install, and that works as expected. I am guessing that is because the
> scheduler always gets started first. Maybe the same for the upgrade? In
> this case, maybe this can work as long as people are using helm charts to
> deploy yunikorn? Craig, could you please look into this and let us know if
> we need to revoke the vote for 0.12.2 and have a 0.12.3?
> 
> Thank you, Chaoran, for raising this. Much appreciated!


Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Weiwei Yang <ww...@apache.org>.
Hmmm, that is a bug. But during the release verification, I have tried the
helm install, and that works as expected. I am guessing that is because the
scheduler always gets started first. Maybe the same for the upgrade? In
this case, maybe this can work as long as people are using helm charts to
deploy yunikorn? Craig, could you please look into this and let us know if
we need to revoke the vote for 0.12.2 and have a 0.12.3?

Thank you, Chaoran, for raising this. Much appreciated!

On Thu, Jan 20, 2022 at 5:00 PM Chaoran Yu <yu...@gmail.com> wrote:

> I just spotted a bug, https://issues.apache.org/jira/browse/YUNIKORN-1038,
> which is critical and worth porting back into branch 0.12.

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Chaoran Yu <yu...@gmail.com>.
I just spotted a bug, https://issues.apache.org/jira/browse/YUNIKORN-1038,
which is critical and worth porting back into branch 0.12.

On Thu, Jan 20, 2022 at 12:12 PM Sunil Govindan <su...@apache.org> wrote:

> A late +1 (binding) from me.
>
> I built this from source, then:
> - Ran a basic Spark job
> - Verified the UI
> - Checked the signature
> - Checked the images
>
> Thanks
> Sunil

Re: [RESULT] [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

Posted by Sunil Govindan <su...@apache.org>.
A late +1 (binding) from me.

I built this from source, then:
- Ran a basic Spark job
- Verified the UI
- Checked the signature
- Checked the images

Thanks
Sunil

On Wed, Jan 19, 2022 at 8:44 AM Craig Condit <ap...@craigcondit.com> wrote:

> Hi all,
>
> The vote to Release Apache YuniKorn (incubating) 0.12.2 RC2 has passed
> with 3 binding +1 votes and 3 non-binding +1 votes.
>
> Vote thread:
> https://lists.apache.org/thread/1gw0k0g5fy86r8ljnjttdco04w7z5j4j
>
> Thank you to all the members who helped verify this release. We will move
> to IPMC voting shortly.
>
>
> Thanks,
> Craig
>
>
>