Posted to dev@pulsar.apache.org by Zixuan Liu <no...@gmail.com> on 2024/03/15 18:00:12 UTC

Re: [DISCUSSION] Proposal to Replace Pulsar's Homegrown Configuration Framework with Gestalt Config

+1

Using Gestalt Config is a good idea; it supports JSON, YAML,
environment variables, and so on.

One thing to note: Gestalt requires Java 11 or higher.

Thanks,
Zixuan

On Wed, Feb 21, 2024 at 17:27 Lari Hotari <lh...@apache.org> wrote:

> Hello everyone,
>
> I would like to bring up an issue with Pulsar's containers, specifically
> regarding the method of overriding configurations. For instance, the
> Apache Pulsar Helm chart employs "bin/apply-config-from-env.py
> conf/broker.conf" and "bin/gen-yml-from-env.py
> conf/functions_worker.yml" [1] to apply configurations passed in the
> environment to the configuration files in the container's root file
> system.
> This approach fails when the container's root file system is read-only
> due to strict security policies (`readOnlyRootFilesystem` in
> `securityContext`). This issue has been reported as #22088 [2].
>
> A temporary fix could involve using a temporary file to modify the
> configuration file when the filesystem is read-only. However, the Python
> script solution is not ideal, and we should consider eliminating it. In
> the long term, it would also be beneficial to remove the need for a
> shell script to start Pulsar, but that's a separate issue.
>
> For configuration handling, we need a solution that can apply overrides
> in memory, eliminating the need to modify on-disk files. Modern
> configuration frameworks can do this out-of-the-box. Currently, Pulsar
> uses a homegrown configuration framework. Instead of extending this
> framework, I propose we discuss replacing it with the Gestalt Config
> library [3]. This library, licensed under Apache-2.0, is a mature,
> well-established solution for configuration handling.
>
> Switching to Gestalt Config would allow us to move towards a more
> structured and modular configuration in Pulsar. Our current
> configuration is not modular, as it relies on a "god object" for
> configuration, which collects all possible configuration options.
> Gestalt Config offers modular usage patterns similar to those of
> Spring Boot's external configuration [4] and the MicroProfile Config [5]
> in Quarkus. However, Gestalt Config does not pull in other dependencies,
> giving it an advantage over Spring Boot and Quarkus configuration
> solutions.
> Other libraries in this category include the Typesafe config library [6]
> from Lightbend with HOCON [7], commonly used in Scala and Akka-based
> applications.
>
> Gestalt Config supports many configuration file formats, including flat
> properties files, yaml, json, toml, and even hocon. It also offers
> security features for reading secrets directly from Vault, AWS Secrets
> Manager, and GCP Secret Manager, without the need to use the file system
> or environment variables to inject secrets into the application
> configuration. This could significantly improve Pulsar's security
> posture.
>
> Pulsar's current "homegrown configuration framework" is quite simple,
> implemented in a few classes with the main logic in
> PulsarConfigurationLoader [8] and FieldParser [9] classes, called from
> the PulsarBrokerStarter class [10].
>
> The main question is: should we continue extending Pulsar's homegrown
> configuration framework, or should we consider adopting a library like
> Gestalt Config for future configuration use case improvements for
> modularity, structured configuration, and security?
>
> Best regards,
>
> Lari
>
> References:
> 1 - https://github.com/apache/pulsar-helm-chart/blob/29ea17b3fceef65160620b9018d0dd0449a168c5/charts/pulsar/templates/broker-statefulset.yaml#L210-L221
> 2 - https://github.com/apache/pulsar/issues/22088
> 3 - https://github.com/gestalt-config/gestalt
> 4 - https://docs.spring.io/spring-boot/docs/current/reference/html/features.html#features.external-config
> 5 - https://microprofile.io/specifications/microprofile-config/
> 6 - https://github.com/lightbend/config
> 7 - https://github.com/lightbend/config/blob/main/HOCON.md
> 8 - https://github.com/apache/pulsar/blob/master/pulsar-broker-common/src/main/java/org/apache/pulsar/common/configuration/PulsarConfigurationLoader.java
> 9 - https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/util/FieldParser.java
> 10 - https://github.com/apache/pulsar/blob/db79096baaa3d7118aa026978a615ddc576f9183/pulsar-broker/src/main/java/org/apache/pulsar/PulsarBrokerStarter.java#L69-L76
>
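
For readers following the thread, the "apply overrides in memory" idea from the quoted proposal can be sketched with the JDK alone. This is only an illustration of the technique, not Gestalt's API and not Pulsar's current loader. The class name, the `load` method, and the `PULSAR_CONF_` prefix are all hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

public class InMemoryOverrides {
    // Hypothetical prefix, used only for this illustration.
    static final String ENV_PREFIX = "PULSAR_CONF_";

    /**
     * Parses properties-format text, then overlays every environment entry
     * whose name starts with ENV_PREFIX. Nothing is written back to disk.
     */
    static Properties load(String confText, Map<String, String> env) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (Map.Entry<String, String> entry : env.entrySet()) {
            String name = entry.getKey();
            if (name.startsWith(ENV_PREFIX)) {
                // Strip the prefix to get the configuration key.
                props.setProperty(name.substring(ENV_PREFIX.length()), entry.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // In a container this text would come from the read-only conf file.
        String conf = "brokerServicePort=6650\nclusterName=standalone\n";
        Properties props = load(conf, System.getenv());
        System.out.println("brokerServicePort=" + props.getProperty("brokerServicePort"));
        System.out.println("clusterName=" + props.getProperty("clusterName"));
    }
}
```

Because the file is only read, this style of loading would work with `readOnlyRootFilesystem` enabled. A library such as Gestalt Config would presumably replace this hand-written overlay with its own source layering.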

Re: [DISCUSSION] Proposal to Replace Pulsar's Homegrown Configuration Framework with Gestalt Config

Posted by Matteo Merli <ma...@gmail.com>.
Pulsar broker already requires Java 17, so it's ok.

> Isn't it a concern that this library is not popular, relatively new, and
> maintained by a single individual?

This looks to be a non-trivial library but also not a big one :)

To me it doesn't look like a big risk. If it gets abandoned down the road, we
can always either switch to a different library or implement the
functionality ourselves (which is what we're doing now).

I think it would be great to use this library for aggregating all the different
forms of config files. This would also solve the problem brought up
in PIP-346, since we could have a `broker-defaults.conf` file and a
`broker.conf` with just a few overrides, and we could remove the
`apply-config-from-env.py` script.
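
A minimal sketch of the defaults-plus-overrides layering described above, using only the JDK. The class and method names are hypothetical, and the two layers are passed as strings to keep the example self-contained; in practice they would be the contents of `broker-defaults.conf` and `broker.conf`. This is not how Gestalt Config or Pulsar implements it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class LayeredConfig {
    /** Parses one layer of properties-format text. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Merges layers in order; a key in a later layer replaces the earlier value. */
    static Properties merge(String... layers) {
        Properties merged = new Properties();
        for (String layer : layers) {
            merged.putAll(parse(layer));
        }
        return merged;
    }

    public static void main(String[] args) {
        String defaults = "brokerServicePort=6650\nwebServicePort=8080\n";
        String overrides = "webServicePort=8081\n"; // only the keys that differ
        Properties merged = merge(defaults, overrides);
        System.out.println(merged.getProperty("brokerServicePort")); // prints 6650
        System.out.println(merged.getProperty("webServicePort"));    // prints 8081
    }
}
```

With this shape the overrides file stays small, and an environment layer could be appended as one more argument to `merge` instead of being applied by a script.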

We should also do this for BookKeeper though, otherwise it won't be as
useful.



--
Matteo Merli
<ma...@gmail.com>



Re: [DISCUSSION] Proposal to Replace Pulsar's Homegrown Configuration Framework with Gestalt Config

Posted by Nicolò Boschi <bo...@gmail.com>.
Isn't it a concern that this library is not popular, relatively new, and
maintained by a single individual?
https://github.com/gestalt-config/gestalt/graphs/contributors

I mean, to me it's ok but we need to take into consideration that we might
need to fork it and include it in the Pulsar project if the maintainer just
stops working on it.

Nicolò Boschi

