Posted to dev@pulsar.apache.org by Lari Hotari <lh...@apache.org> on 2023/06/27 10:52:02 UTC

Re: Pulsar doesn't support cgroup v2 which became the default in Kubernetes v1.25+ (AKS v1.25, GKE v1.26, EKS v1.26)

We don't have any released stable version of Pulsar with cgroup v2 support.

Azure AKS Kubernetes 1.25+ switches to cgroup v2 [1], and AKS Kubernetes 1.24 reaches end of life on July 30, 2023 [2]. This is why the issue is becoming urgent.

GKE will continue to offer a way to select between cgroup v1 and cgroup v2 [3]. However, even GKE will default to cgroup v2 in new Kubernetes 1.26 clusters or node pools.
AWS EKS v1.26 nodes will default to cgroup v2.

I have backported https://github.com/apache/pulsar/pull/16832 to branch-2.10 in this PR:
https://github.com/apache/pulsar/pull/20659
The backport requires including all refactorings of the Linux CPU, network and memory metrics.

I'll proceed with cherry-picking https://github.com/apache/pulsar/pull/16832 to branch-3.0 and branch-2.11. I don't think that supporting branch-2.9 is feasible, since cgroup v2 support requires Java 11.0.16+.

-Lari

[1] https://github.com/Azure/AKS/releases/tag/2023-03-05
[2] https://learn.microsoft.com/en-us/azure/aks/supported-kubernetes-versions?tabs=azure-cli#aks-kubernetes-release-calendar
[3] https://cloud.google.com/kubernetes-engine/docs/how-to/node-system-config#cgroup-mode-options
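
For readers who want to check which cgroup version a node or container is actually running, here is a minimal sketch. It is not taken from the Pulsar PRs above. It assumes the usual kernel layout: a cgroup v2 unified hierarchy exposes a cgroup.controllers file at the mount root (typically /sys/fs/cgroup), and v2 limits are stored in cpu.max and memory.max. The function names and the root parameter are illustrative only.

```python
from pathlib import Path
from typing import Optional


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """Return True when `root` looks like a cgroup v2 unified hierarchy."""
    # cgroup v2 exposes cgroup.controllers at the root of the mount;
    # cgroup v1 uses per-controller subdirectories (cpu, memory, ...) instead.
    return (Path(root) / "cgroup.controllers").is_file()


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse the content of a cgroup v2 cpu.max file.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when no CPU limit is set. Returns the limit in cores, or None if unlimited.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Parse the content of a cgroup v2 memory.max file.

    The file holds a byte count, or "max" when no memory limit is set.
    """
    value = text.strip()
    return None if value == "max" else int(value)


if __name__ == "__main__":
    print("cgroup v2" if is_cgroup_v2() else "cgroup v1 (or no cgroup mount found)")
```

Running this inside a broker pod shows which hierarchy the container sees. The two parsers take the file content as a string, so they can be tried without a cgroup v2 host.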

On 2023/05/04 00:45:17 Frank Kelly wrote:
> This sounds like a very important issue for those of us seeking to use
> autoscaling - will the fix be back-ported to 2.11/2.10/2.9 etc?
> Alternatively is there a work-around?
> 
> -Frank
> 
> On Thu, Apr 27, 2023 at 2:37 AM Lari Hotari <lh...@apache.org> wrote:
> 
> > Thank you, Cong. That will be very helpful.
> >
> > -Lari
> >
> > On 2023/04/27 04:55:24 Cong Zhao wrote:
> > > Hi Lari Hotari,
> > >
> > > I would like to pick up this work. I will update
> > > https://github.com/apache/pulsar/pull/16832 soon.
> > >
> > > Thanks,
> > > Cong Zhao
> > >
> > > On 2023/04/26 15:17:37 Lari Hotari wrote:
> > > > Hi all,
> > > >
> > > > Pulsar doesn't support cgroup v2, which becomes the default in
> > > > Kubernetes v1.25+.
> > > > Kubernetes announcement:
> > > > https://kubernetes.io/blog/2022/08/31/cgroupv2-ga-1-25/ .
> > > > Pulsar issue: https://github.com/apache/pulsar/issues/16601
> > > >
> > > > The impact of this is that the Pulsar load balancer won't have correct
> > > > CPU and memory information for making load balancing decisions.
> > > >
> > > > The cloud provider managed Kubernetes services have already switched
> > > > to cgroup v2 as the default. This happened in AKS v1.25, GKE v1.26 and
> > > > in EKS v1.26.
> > > > For GKE, it's possible to keep using cgroup v1 in GKE v1.26 as well
> > > > (https://cloud.google.com/kubernetes-engine/docs/how-to/node-system-config#cgroup-mode-options).
> > > > For AKS and EKS, it's unknown whether such a configuration option
> > > > exists.
> > > >
> > > > There's a previous attempt in this PR to add cgroup v2 support to
> > > > Pulsar: https://github.com/apache/pulsar/pull/16832 . Would it be
> > > > possible to continue the work for supporting cgroup v2 in Pulsar
> > > > either with the existing PR or a new one?
> > > >
> > > > Who would like to pick up this work?
> > > > This is urgent since cgroup v2 is enabled by default in the latest
> > > > managed Kubernetes services (AKS v1.25, GKE v1.26 and EKS v1.26).
> > > >
> > > > Regards,
> > > >
> > > > -Lari
> > > >
> > >
> >
> 

Re: Pulsar doesn't support cgroup v2 which became the default in Kubernetes v1.25+ (AKS v1.25, GKE v1.26, EKS v1.26)

Posted by Enrico Olivelli <eo...@gmail.com>.
On Tue, Jun 27, 2023 at 12:58 PM Lari Hotari
<lh...@apache.org> wrote:
>
> I take back what I said about branch-2.9. It also uses Java 11 at runtime, so it should be possible to apply https://github.com/apache/pulsar/pull/20659 to branch-2.9 too, which would give us cgroup v2 support for Pulsar 2.9.x.
>
> -Lari
>

Thank you very much

I agree that this is very important for k8s users

Enrico

Re: Pulsar doesn't support cgroup v2 which became the default in Kubernetes v1.25+ (AKS v1.25, GKE v1.26, EKS v1.26)

Posted by Lari Hotari <lh...@apache.org>.
I take back what I said about branch-2.9. It also uses Java 11 at runtime, so it should be possible to apply https://github.com/apache/pulsar/pull/20659 to branch-2.9 too, which would give us cgroup v2 support for Pulsar 2.9.x.

-Lari
