You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Michael Marshall <mm...@apache.org> on 2022/01/05 19:25:46 UTC

Re: [PR] Pulsar non root docker image

PR 13376 is ready for review, PTAL.

What approach should we take regarding docker image size?
I propose providing a minimal image along with documentation
on how to add your own debugging tools. Is that sufficient?

I'd like to include this feature in 2.10.0.

Note that you can test the new docker image here:
michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9

Thanks,
Michael



On Wed, Dec 22, 2021 at 1:51 PM Michael Marshall <mm...@apache.org> wrote:
>
> Thanks for raising this important topic, Enrico.
>
> > Basically if you are running as non root, you cannot add tools on demand,
> > so we need to add basic tools, like netstat/vim/unzip.... otherwise when
> > you have a problem you are trapped.
>
> I agree that running as a non root user presents challenges for
> debugging. However, I don't think we should optimize our production
> image for debugging. We should instead optimize for a minimal docker
> image with as few dependencies as possible to decrease the image's
> attack surface.
>
> Also, I think it would be valuable to audit the current
> dependencies we already ship in our docker image. For example, this
> single `RUN` command [0] adds 1 GB of dependencies to our
> docker image.
>
> If there is a need for an image with debug tools, we could publish a
> "debug" image that extends our production image with extra tooling.
> Users could swap out the prod image with the debug image, as needed.
> However, it is pretty easy and quick to extend a public docker image
> to add your own tooling. By documenting how to extend our docker
> image, we could avoid decisions about which tools should be in our
> image.
>
> > there are ways to inject tools, but they are very hard to execute for
> > people who is not very experienced
>
> We can solve this through additional documentation. Freeznet asked a
> similar question about debugging on the PR. I provided some methods for
> debugging here [1].
>
> Thanks,
> Michael
>
> [0] https://github.com/apache/pulsar/blob/5dd60dbd748e446f8da396b448a5bb16a2ae6902/docker/pulsar/Dockerfile#L46-L55
> [1] https://github.com/apache/pulsar/pull/13376#discussion_r773580954
>
>
>
>
> On Wed, Dec 22, 2021 at 1:29 AM Enrico Olivelli <eo...@gmail.com> wrote:
> >
> > Michael,
> >
> > I would like to add this item to the list
> > https://github.com/apache/pulsar/pull/10815
> >
> > Basically if you are running as non root, you cannot add tools on demand,
> > so we need to add basic tools, like netstat/vim/unzip.... otherwise when
> > you have a problem you are trapped.
> >
> > there are ways to inject tools, but they are very hard to execute for
> > people who is not very experienced
> >
> > Enrico
> >
> >
> > Il giorno mer 22 dic 2021 alle ore 04:40 Haiting Jiang <
> > jianghaiting@apache.org> ha scritto:
> >
> > > > 1. Pulsar configuration is read in only from configuration files in
> > > > `/pulsar/conf`. A non root user must be able to update these files to
> > > > have run with custom configuration.
> > >
> > > About the configurations, I also see similar require like this lately [0].
> > > IMHO, update any configs without redeploy service is a useful feature.
> > > I would like post a PIP for this later.
> > > Basic idea is like make all configs dynamic by default except
> > > `metadataStoreUrl` and all configs are stored under path
> > > "/admin/configuration" in metadata store.
> > >
> > > [0] https://github.com/apache/pulsar/pull/13074
> > >
> > > On 2021/12/21 20:16:44 Michael Marshall wrote:
> > > > All tests are now passing for this PR [0]. I built the docker image
> > > > and pushed it to my personal repository to simplify testing [1] for
> > > > anyone interested in verifying the changes.
> > > >
> > > > I would like our docker image to be as close to immutable as possible.
> > > > As far as I can tell, here are the only reasons the image cannot be
> > > > immutable:
> > > >
> > > > 1. Pulsar configuration is read in only from configuration files in
> > > > `/pulsar/conf`. A non root user must be able to update these files to
> > > > have run with custom configuration.
> > > > 2. The Pulsar function worker unpacks nar files to
> > > > `/pulsar/download/pulsar_functions` by default.
> > > > 3. Pulsar tiered storage writes to `/pulsar` by default when using
> > > > filesystem storage.
> > > > 4. The Presto SQL worker writes to `/pulsar/lib/presto/` by default.
> > > > 5. Pulsar-admin and functions write to `/pulsar/log` by default
> > > > (possibly other components too).
> > > >
> > > > Note that even though bookkeepers and zookeepers write to
> > > > `/pulsar/data`, they should be writing to docker volumes, in which
> > > > case, the host's file system permissions are all that matter.
> > > >
> > > > If we can remove any of the above reasons, we can decrease the
> > > > privilege in the docker image.
> > > >
> > > > The PR has more detail. Please take a look.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > >
> > > >
> > > > On Fri, Dec 17, 2021 at 12:33 AM Michael Marshall <mm...@apache.org>
> > > wrote:
> > > > >
> > > > > Hi Pulsar Community,
> > > > >
> > > > > I opened a PR to make our pulsar and pulsar-all docker images non root
> > > > > and OpenShift compliant [0]. As some may remember, we had issues with
> > > > > these changes before due to lack of testing. I plan to test thoroughly
> > > > > before we merge this PR, and it'd be great to have others test too. I
> > > > > published a build of my PR [1]. I also have an issue [2] tracking this
> > > > > work.
> > > > >
> > > > > Please take a look. I hope to make our 2.10 release a non root release!
> > > > >
> > > > > Thanks,
> > > > > Michael
> > > > >
> > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
> > > > > [2] https://github.com/apache/pulsar/issues/11269
> > > >
> > >

Re: [PR] Pulsar non root docker image

Posted by Michael Marshall <mm...@apache.org>.
> I am not sure if the fix in
> https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b
> is sufficient.

Are you able to describe your concerns so that I can address them?
That commit resolves the upgrade issue because the secret is readable
by any user in the container. Switching users on function upgrade
won't affect the secret's readability.

> I would like to see there is at least one integration test that covers
> running functions on k8s to ensure we don't break the basic stuff.

Are you asking for an integration test in the project or for an ad hoc
manual test?

Thanks,
Michael

On Wed, Jan 12, 2022 at 9:39 PM Sijie Guo <gu...@gmail.com> wrote:
>
> I am not sure if the fix in
> https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b
> is sufficient.
>
> I would like to see there is at least one integration test that covers
> running functions on k8s to ensure we don't break the basic stuff.
>
> - Sijie
>
> On Fri, Jan 7, 2022 at 9:58 PM Michael Marshall <mm...@apache.org>
> wrote:
>
> > > Is Functions being verified?
> >
> > I discussed this a bit in PR 13376's description and comments, please
> > let me know if you have additional questions. I haven't done any extra
> > verification.
> >
> > Note that since [0] is in both 2.8.x and 2.9.x, the upgrade scenario
> > that led us to revert the first non-root work is no longer an issue.
> >
> > I plan to follow up with a PR to the Pulsar Helm Chart to make the
> > fsGroup configurable.
> >
> > - Michael
> >
> > [0]
> > https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b
> >
> > On Thu, Jan 6, 2022 at 7:00 PM Sijie Guo <gu...@gmail.com> wrote:
> > >
> > > Is Functions being verified?
> > >
> > > - Sijie
> > >
> > > On Wed, Jan 5, 2022 at 11:26 AM Michael Marshall <mm...@apache.org>
> > > wrote:
> > >
> > > > PR 13376 is ready for review, PTAL.
> > > >
> > > > What approach should we take regarding docker image size?
> > > > I propose providing a minimal image along with documentation
> > > > on how to add your own debugging tools. Is that sufficient?
> > > >
> > > > I'd like to include this feature in 2.10.0.
> > > >
> > > > Note that you can test the new docker image here:
> > > > michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > >
> > > >
> > > > On Wed, Dec 22, 2021 at 1:51 PM Michael Marshall <mmarshall@apache.org
> > >
> > > > wrote:
> > > > >
> > > > > Thanks for raising this important topic, Enrico.
> > > > >
> > > > > > Basically if you are running as non root, you cannot add tools on
> > > > demand,
> > > > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > > > when
> > > > > > you have a problem you are trapped.
> > > > >
> > > > > I agree that running as a non root user presents challenges for
> > > > > debugging. However, I don't think we should optimize our production
> > > > > image for debugging. We should instead optimize for a minimal docker
> > > > > image with as few dependencies as possible to decrease the image's
> > > > > attack surface.
> > > > >
> > > > > Also, I think it would be valuable to audit the current
> > > > > dependencies we already ship in our docker image. For example, this
> > > > > single `RUN` command [0] adds 1 GB of dependencies to our
> > > > > docker image.
> > > > >
> > > > > If there is a need for an image with debug tools, we could publish a
> > > > > "debug" image that extends our production image with extra tooling.
> > > > > Users could swap out the prod image with the debug image, as needed.
> > > > > However, it is pretty easy and quick to extend a public docker image
> > > > > to add your own tooling. By documenting how to extend our docker
> > > > > image, we could avoid decisions about which tools should be in our
> > > > > image.
> > > > >
> > > > > > there are ways to inject tools, but they are very hard to execute
> > for
> > > > > > people who is not very experienced
> > > > >
> > > > > We can solve this through additional documentation. Freeznet asked a
> > > > > similar question about debugging on the PR. I provided some methods
> > for
> > > > > debugging here [1].
> > > > >
> > > > > Thanks,
> > > > > Michael
> > > > >
> > > > > [0]
> > > >
> > https://github.com/apache/pulsar/blob/5dd60dbd748e446f8da396b448a5bb16a2ae6902/docker/pulsar/Dockerfile#L46-L55
> > > > > [1]
> > https://github.com/apache/pulsar/pull/13376#discussion_r773580954
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Dec 22, 2021 at 1:29 AM Enrico Olivelli <eolivelli@gmail.com
> > >
> > > > wrote:
> > > > > >
> > > > > > Michael,
> > > > > >
> > > > > > I would like to add this item to the list
> > > > > > https://github.com/apache/pulsar/pull/10815
> > > > > >
> > > > > > Basically if you are running as non root, you cannot add tools on
> > > > demand,
> > > > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > > > when
> > > > > > you have a problem you are trapped.
> > > > > >
> > > > > > there are ways to inject tools, but they are very hard to execute
> > for
> > > > > > people who is not very experienced
> > > > > >
> > > > > > Enrico
> > > > > >
> > > > > >
> > > > > > Il giorno mer 22 dic 2021 alle ore 04:40 Haiting Jiang <
> > > > > > jianghaiting@apache.org> ha scritto:
> > > > > >
> > > > > > > > 1. Pulsar configuration is read in only from configuration
> > files in
> > > > > > > > `/pulsar/conf`. A non root user must be able to update these
> > files
> > > > to
> > > > > > > > have run with custom configuration.
> > > > > > >
> > > > > > > About the configurations, I also see similar require like this
> > > > lately [0].
> > > > > > > IMHO, update any configs without redeploy service is a useful
> > > > feature.
> > > > > > > I would like post a PIP for this later.
> > > > > > > Basic idea is like make all configs dynamic by default except
> > > > > > > `metadataStoreUrl` and all configs are stored under path
> > > > > > > "/admin/configuration" in metadata store.
> > > > > > >
> > > > > > > [0] https://github.com/apache/pulsar/pull/13074
> > > > > > >
> > > > > > > On 2021/12/21 20:16:44 Michael Marshall wrote:
> > > > > > > > All tests are now passing for this PR [0]. I built the docker
> > image
> > > > > > > > and pushed it to my personal repository to simplify testing
> > [1] for
> > > > > > > > anyone interested in verifying the changes.
> > > > > > > >
> > > > > > > > I would like our docker image to be as close to immutable as
> > > > possible.
> > > > > > > > As far as I can tell, here are the only reasons the image
> > cannot be
> > > > > > > > immutable:
> > > > > > > >
> > > > > > > > 1. Pulsar configuration is read in only from configuration
> > files in
> > > > > > > > `/pulsar/conf`. A non root user must be able to update these
> > files
> > > > to
> > > > > > > > have run with custom configuration.
> > > > > > > > 2. The Pulsar function worker unpacks nar files to
> > > > > > > > `/pulsar/download/pulsar_functions` by default.
> > > > > > > > 3. Pulsar tiered storage writes to `/pulsar` by default when
> > using
> > > > > > > > filesystem storage.
> > > > > > > > 4. The Presto SQL worker writes to `/pulsar/lib/presto/` by
> > > > default.
> > > > > > > > 5. Pulsar-admin and functions write to `/pulsar/log` by default
> > > > > > > > (possibly other components too).
> > > > > > > >
> > > > > > > > Note that even though bookkeepers and zookeepers write to
> > > > > > > > `/pulsar/data`, they should be writing to docker volumes, in
> > which
> > > > > > > > case, the host's file system permissions are all that matter.
> > > > > > > >
> > > > > > > > If we can remove any of the above reasons, we can decrease the
> > > > > > > > privilege in the docker image.
> > > > > > > >
> > > > > > > > The PR has more detail. Please take a look.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Michael
> > > > > > > >
> > > > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Dec 17, 2021 at 12:33 AM Michael Marshall <
> > > > mmarshall@apache.org>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi Pulsar Community,
> > > > > > > > >
> > > > > > > > > I opened a PR to make our pulsar and pulsar-all docker images
> > > > non root
> > > > > > > > > and OpenShift compliant [0]. As some may remember, we had
> > issues
> > > > with
> > > > > > > > > these changes before due to lack of testing. I plan to test
> > > > thoroughly
> > > > > > > > > before we merge this PR, and it'd be great to have others
> > test
> > > > too. I
> > > > > > > > > published a build of my PR [1]. I also have an issue [2]
> > > > tracking this
> > > > > > > > > work.
> > > > > > > > >
> > > > > > > > > Please take a look. I hope to make our 2.10 release a non
> > root
> > > > release!
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Michael
> > > > > > > > >
> > > > > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
> > > > > > > > > [2] https://github.com/apache/pulsar/issues/11269
> > > > > > > >
> > > > > > >
> > > >
> >

Re: [PR] Pulsar non root docker image

Posted by Sijie Guo <gu...@gmail.com>.
I am not sure if the fix in
https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b
is sufficient.

I would like to see there is at least one integration test that covers
running functions on k8s to ensure we don't break the basic stuff.

- Sijie

On Fri, Jan 7, 2022 at 9:58 PM Michael Marshall <mm...@apache.org>
wrote:

> > Is Functions being verified?
>
> I discussed this a bit in PR 13376's description and comments, please
> let me know if you have additional questions. I haven't done any extra
> verification.
>
> Note that since [0] is in both 2.8.x and 2.9.x, the upgrade scenario
> that led us to revert the first non-root work is no longer an issue.
>
> I plan to follow up with a PR to the Pulsar Helm Chart to make the
> fsGroup configurable.
>
> - Michael
>
> [0]
> https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b
>
> On Thu, Jan 6, 2022 at 7:00 PM Sijie Guo <gu...@gmail.com> wrote:
> >
> > Is Functions being verified?
> >
> > - Sijie
> >
> > On Wed, Jan 5, 2022 at 11:26 AM Michael Marshall <mm...@apache.org>
> > wrote:
> >
> > > PR 13376 is ready for review, PTAL.
> > >
> > > What approach should we take regarding docker image size?
> > > I propose providing a minimal image along with documentation
> > > on how to add your own debugging tools. Is that sufficient?
> > >
> > > I'd like to include this feature in 2.10.0.
> > >
> > > Note that you can test the new docker image here:
> > > michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > >
> > > Thanks,
> > > Michael
> > >
> > >
> > >
> > > On Wed, Dec 22, 2021 at 1:51 PM Michael Marshall <mmarshall@apache.org
> >
> > > wrote:
> > > >
> > > > Thanks for raising this important topic, Enrico.
> > > >
> > > > > Basically if you are running as non root, you cannot add tools on
> > > demand,
> > > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > > when
> > > > > you have a problem you are trapped.
> > > >
> > > > I agree that running as a non root user presents challenges for
> > > > debugging. However, I don't think we should optimize our production
> > > > image for debugging. We should instead optimize for a minimal docker
> > > > image with as few dependencies as possible to decrease the image's
> > > > attack surface.
> > > >
> > > > Also, I think it would be valuable to audit the current
> > > > dependencies we already ship in our docker image. For example, this
> > > > single `RUN` command [0] adds 1 GB of dependencies to our
> > > > docker image.
> > > >
> > > > If there is a need for an image with debug tools, we could publish a
> > > > "debug" image that extends our production image with extra tooling.
> > > > Users could swap out the prod image with the debug image, as needed.
> > > > However, it is pretty easy and quick to extend a public docker image
> > > > to add your own tooling. By documenting how to extend our docker
> > > > image, we could avoid decisions about which tools should be in our
> > > > image.
> > > >
> > > > > there are ways to inject tools, but they are very hard to execute
> for
> > > > > people who is not very experienced
> > > >
> > > > We can solve this through additional documentation. Freeznet asked a
> > > > similar question about debugging on the PR. I provided some methods
> for
> > > > debugging here [1].
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > [0]
> > >
> https://github.com/apache/pulsar/blob/5dd60dbd748e446f8da396b448a5bb16a2ae6902/docker/pulsar/Dockerfile#L46-L55
> > > > [1]
> https://github.com/apache/pulsar/pull/13376#discussion_r773580954
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Dec 22, 2021 at 1:29 AM Enrico Olivelli <eolivelli@gmail.com
> >
> > > wrote:
> > > > >
> > > > > Michael,
> > > > >
> > > > > I would like to add this item to the list
> > > > > https://github.com/apache/pulsar/pull/10815
> > > > >
> > > > > Basically if you are running as non root, you cannot add tools on
> > > demand,
> > > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > > when
> > > > > you have a problem you are trapped.
> > > > >
> > > > > there are ways to inject tools, but they are very hard to execute
> for
> > > > > people who is not very experienced
> > > > >
> > > > > Enrico
> > > > >
> > > > >
> > > > > Il giorno mer 22 dic 2021 alle ore 04:40 Haiting Jiang <
> > > > > jianghaiting@apache.org> ha scritto:
> > > > >
> > > > > > > 1. Pulsar configuration is read in only from configuration
> files in
> > > > > > > `/pulsar/conf`. A non root user must be able to update these
> files
> > > to
> > > > > > > have run with custom configuration.
> > > > > >
> > > > > > About the configurations, I also see similar require like this
> > > lately [0].
> > > > > > IMHO, update any configs without redeploy service is a useful
> > > feature.
> > > > > > I would like post a PIP for this later.
> > > > > > Basic idea is like make all configs dynamic by default except
> > > > > > `metadataStoreUrl` and all configs are stored under path
> > > > > > "/admin/configuration" in metadata store.
> > > > > >
> > > > > > [0] https://github.com/apache/pulsar/pull/13074
> > > > > >
> > > > > > On 2021/12/21 20:16:44 Michael Marshall wrote:
> > > > > > > All tests are now passing for this PR [0]. I built the docker
> image
> > > > > > > and pushed it to my personal repository to simplify testing
> [1] for
> > > > > > > anyone interested in verifying the changes.
> > > > > > >
> > > > > > > I would like our docker image to be as close to immutable as
> > > possible.
> > > > > > > As far as I can tell, here are the only reasons the image
> cannot be
> > > > > > > immutable:
> > > > > > >
> > > > > > > 1. Pulsar configuration is read in only from configuration
> files in
> > > > > > > `/pulsar/conf`. A non root user must be able to update these
> files
> > > to
> > > > > > > have run with custom configuration.
> > > > > > > 2. The Pulsar function worker unpacks nar files to
> > > > > > > `/pulsar/download/pulsar_functions` by default.
> > > > > > > 3. Pulsar tiered storage writes to `/pulsar` by default when
> using
> > > > > > > filesystem storage.
> > > > > > > 4. The Presto SQL worker writes to `/pulsar/lib/presto/` by
> > > default.
> > > > > > > 5. Pulsar-admin and functions write to `/pulsar/log` by default
> > > > > > > (possibly other components too).
> > > > > > >
> > > > > > > Note that even though bookkeepers and zookeepers write to
> > > > > > > `/pulsar/data`, they should be writing to docker volumes, in
> which
> > > > > > > case, the host's file system permissions are all that matter.
> > > > > > >
> > > > > > > If we can remove any of the above reasons, we can decrease the
> > > > > > > privilege in the docker image.
> > > > > > >
> > > > > > > The PR has more detail. Please take a look.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Michael
> > > > > > >
> > > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Dec 17, 2021 at 12:33 AM Michael Marshall <
> > > mmarshall@apache.org>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi Pulsar Community,
> > > > > > > >
> > > > > > > > I opened a PR to make our pulsar and pulsar-all docker images
> > > non root
> > > > > > > > and OpenShift compliant [0]. As some may remember, we had
> issues
> > > with
> > > > > > > > these changes before due to lack of testing. I plan to test
> > > thoroughly
> > > > > > > > before we merge this PR, and it'd be great to have others
> test
> > > too. I
> > > > > > > > published a build of my PR [1]. I also have an issue [2]
> > > tracking this
> > > > > > > > work.
> > > > > > > >
> > > > > > > > Please take a look. I hope to make our 2.10 release a non
> root
> > > release!
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Michael
> > > > > > > >
> > > > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
> > > > > > > > [2] https://github.com/apache/pulsar/issues/11269
> > > > > > >
> > > > > >
> > >
>

Re: [PR] Pulsar non root docker image

Posted by Michael Marshall <mm...@apache.org>.
> Is Functions being verified?

I discussed this a bit in PR 13376's description and comments, please
let me know if you have additional questions. I haven't done any extra
verification.

Note that since [0] is in both 2.8.x and 2.9.x, the upgrade scenario
that led us to revert the first non-root work is no longer an issue.

I plan to follow up with a PR to the Pulsar Helm Chart to make the
fsGroup configurable.

- Michael

[0] https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b

On Thu, Jan 6, 2022 at 7:00 PM Sijie Guo <gu...@gmail.com> wrote:
>
> Is Functions being verified?
>
> - Sijie
>
> On Wed, Jan 5, 2022 at 11:26 AM Michael Marshall <mm...@apache.org>
> wrote:
>
> > PR 13376 is ready for review, PTAL.
> >
> > What approach should we take regarding docker image size?
> > I propose providing a minimal image along with documentation
> > on how to add your own debugging tools. Is that sufficient?
> >
> > I'd like to include this feature in 2.10.0.
> >
> > Note that you can test the new docker image here:
> > michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> >
> > Thanks,
> > Michael
> >
> >
> >
> > On Wed, Dec 22, 2021 at 1:51 PM Michael Marshall <mm...@apache.org>
> > wrote:
> > >
> > > Thanks for raising this important topic, Enrico.
> > >
> > > > Basically if you are running as non root, you cannot add tools on
> > demand,
> > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > when
> > > > you have a problem you are trapped.
> > >
> > > I agree that running as a non root user presents challenges for
> > > debugging. However, I don't think we should optimize our production
> > > image for debugging. We should instead optimize for a minimal docker
> > > image with as few dependencies as possible to decrease the image's
> > > attack surface.
> > >
> > > Also, I think it would be valuable to audit the current
> > > dependencies we already ship in our docker image. For example, this
> > > single `RUN` command [0] adds 1 GB of dependencies to our
> > > docker image.
> > >
> > > If there is a need for an image with debug tools, we could publish a
> > > "debug" image that extends our production image with extra tooling.
> > > Users could swap out the prod image with the debug image, as needed.
> > > However, it is pretty easy and quick to extend a public docker image
> > > to add your own tooling. By documenting how to extend our docker
> > > image, we could avoid decisions about which tools should be in our
> > > image.
> > >
> > > > there are ways to inject tools, but they are very hard to execute for
> > > > people who is not very experienced
> > >
> > > We can solve this through additional documentation. Freeznet asked a
> > > similar question about debugging on the PR. I provided some methods for
> > > debugging here [1].
> > >
> > > Thanks,
> > > Michael
> > >
> > > [0]
> > https://github.com/apache/pulsar/blob/5dd60dbd748e446f8da396b448a5bb16a2ae6902/docker/pulsar/Dockerfile#L46-L55
> > > [1] https://github.com/apache/pulsar/pull/13376#discussion_r773580954
> > >
> > >
> > >
> > >
> > > On Wed, Dec 22, 2021 at 1:29 AM Enrico Olivelli <eo...@gmail.com>
> > wrote:
> > > >
> > > > Michael,
> > > >
> > > > I would like to add this item to the list
> > > > https://github.com/apache/pulsar/pull/10815
> > > >
> > > > Basically if you are running as non root, you cannot add tools on
> > demand,
> > > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> > when
> > > > you have a problem you are trapped.
> > > >
> > > > there are ways to inject tools, but they are very hard to execute for
> > > > people who is not very experienced
> > > >
> > > > Enrico
> > > >
> > > >
> > > > Il giorno mer 22 dic 2021 alle ore 04:40 Haiting Jiang <
> > > > jianghaiting@apache.org> ha scritto:
> > > >
> > > > > > 1. Pulsar configuration is read in only from configuration files in
> > > > > > `/pulsar/conf`. A non root user must be able to update these files
> > to
> > > > > > have run with custom configuration.
> > > > >
> > > > > About the configurations, I also see similar require like this
> > lately [0].
> > > > > IMHO, update any configs without redeploy service is a useful
> > feature.
> > > > > I would like post a PIP for this later.
> > > > > Basic idea is like make all configs dynamic by default except
> > > > > `metadataStoreUrl` and all configs are stored under path
> > > > > "/admin/configuration" in metadata store.
> > > > >
> > > > > [0] https://github.com/apache/pulsar/pull/13074
> > > > >
> > > > > On 2021/12/21 20:16:44 Michael Marshall wrote:
> > > > > > All tests are now passing for this PR [0]. I built the docker image
> > > > > > and pushed it to my personal repository to simplify testing [1] for
> > > > > > anyone interested in verifying the changes.
> > > > > >
> > > > > > I would like our docker image to be as close to immutable as
> > possible.
> > > > > > As far as I can tell, here are the only reasons the image cannot be
> > > > > > immutable:
> > > > > >
> > > > > > 1. Pulsar configuration is read in only from configuration files in
> > > > > > `/pulsar/conf`. A non root user must be able to update these files
> > to
> > > > > > have run with custom configuration.
> > > > > > 2. The Pulsar function worker unpacks nar files to
> > > > > > `/pulsar/download/pulsar_functions` by default.
> > > > > > 3. Pulsar tiered storage writes to `/pulsar` by default when using
> > > > > > filesystem storage.
> > > > > > 4. The Presto SQL worker writes to `/pulsar/lib/presto/` by
> > default.
> > > > > > 5. Pulsar-admin and functions write to `/pulsar/log` by default
> > > > > > (possibly other components too).
> > > > > >
> > > > > > Note that even though bookkeepers and zookeepers write to
> > > > > > `/pulsar/data`, they should be writing to docker volumes, in which
> > > > > > case, the host's file system permissions are all that matter.
> > > > > >
> > > > > > If we can remove any of the above reasons, we can decrease the
> > > > > > privilege in the docker image.
> > > > > >
> > > > > > The PR has more detail. Please take a look.
> > > > > >
> > > > > > Thanks,
> > > > > > Michael
> > > > > >
> > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > > > >
> > > > > >
> > > > > > On Fri, Dec 17, 2021 at 12:33 AM Michael Marshall <
> > mmarshall@apache.org>
> > > > > wrote:
> > > > > > >
> > > > > > > Hi Pulsar Community,
> > > > > > >
> > > > > > > I opened a PR to make our pulsar and pulsar-all docker images
> > non root
> > > > > > > and OpenShift compliant [0]. As some may remember, we had issues
> > with
> > > > > > > these changes before due to lack of testing. I plan to test
> > thoroughly
> > > > > > > before we merge this PR, and it'd be great to have others test
> > too. I
> > > > > > > published a build of my PR [1]. I also have an issue [2]
> > tracking this
> > > > > > > work.
> > > > > > >
> > > > > > > Please take a look. I hope to make our 2.10 release a non root
> > release!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Michael
> > > > > > >
> > > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
> > > > > > > [2] https://github.com/apache/pulsar/issues/11269
> > > > > >
> > > > >
> >

Re: [PR] Pulsar non root docker image

Posted by Sijie Guo <gu...@gmail.com>.
Is Functions being verified?

- Sijie

On Wed, Jan 5, 2022 at 11:26 AM Michael Marshall <mm...@apache.org>
wrote:

> PR 13376 is ready for review, PTAL.
>
> What approach should we take regarding docker image size?
> I propose providing a minimal image along with documentation
> on how to add your own debugging tools. Is that sufficient?
>
> I'd like to include this feature in 2.10.0.
>
> Note that you can test the new docker image here:
> michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
>
> Thanks,
> Michael
>
>
>
> On Wed, Dec 22, 2021 at 1:51 PM Michael Marshall <mm...@apache.org>
> wrote:
> >
> > Thanks for raising this important topic, Enrico.
> >
> > > Basically if you are running as non root, you cannot add tools on
> demand,
> > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> when
> > > you have a problem you are trapped.
> >
> > I agree that running as a non root user presents challenges for
> > debugging. However, I don't think we should optimize our production
> > image for debugging. We should instead optimize for a minimal docker
> > image with as few dependencies as possible to decrease the image's
> > attack surface.
> >
> > Also, I think it would be valuable to audit the current
> > dependencies we already ship in our docker image. For example, this
> > single `RUN` command [0] adds 1 GB of dependencies to our
> > docker image.
> >
> > If there is a need for an image with debug tools, we could publish a
> > "debug" image that extends our production image with extra tooling.
> > Users could swap out the prod image with the debug image, as needed.
> > However, it is pretty easy and quick to extend a public docker image
> > to add your own tooling. By documenting how to extend our docker
> > image, we could avoid decisions about which tools should be in our
> > image.
> >
> > > there are ways to inject tools, but they are very hard to execute for
> > > people who is not very experienced
> >
> > We can solve this through additional documentation. Freeznet asked a
> > similar question about debugging on the PR. I provided some methods for
> > debugging here [1].
> >
> > Thanks,
> > Michael
> >
> > [0]
> https://github.com/apache/pulsar/blob/5dd60dbd748e446f8da396b448a5bb16a2ae6902/docker/pulsar/Dockerfile#L46-L55
> > [1] https://github.com/apache/pulsar/pull/13376#discussion_r773580954
> >
> >
> >
> >
> > On Wed, Dec 22, 2021 at 1:29 AM Enrico Olivelli <eo...@gmail.com>
> wrote:
> > >
> > > Michael,
> > >
> > > I would like to add this item to the list
> > > https://github.com/apache/pulsar/pull/10815
> > >
> > > Basically if you are running as non root, you cannot add tools on
> demand,
> > > so we need to add basic tools, like netstat/vim/unzip.... otherwise
> when
> > > you have a problem you are trapped.
> > >
> > > there are ways to inject tools, but they are very hard to execute for
> > > people who is not very experienced
> > >
> > > Enrico
> > >
> > >
> > > Il giorno mer 22 dic 2021 alle ore 04:40 Haiting Jiang <
> > > jianghaiting@apache.org> ha scritto:
> > >
> > > > > 1. Pulsar configuration is read in only from configuration files in
> > > > > `/pulsar/conf`. A non root user must be able to update these files
> to
> > > > > have run with custom configuration.
> > > >
> > > > About the configurations, I also see similar require like this
> lately [0].
> > > > IMHO, update any configs without redeploy service is a useful
> feature.
> > > > I would like post a PIP for this later.
> > > > Basic idea is like make all configs dynamic by default except
> > > > `metadataStoreUrl` and all configs are stored under path
> > > > "/admin/configuration" in metadata store.
> > > >
> > > > [0] https://github.com/apache/pulsar/pull/13074
> > > >
> > > > On 2021/12/21 20:16:44 Michael Marshall wrote:
> > > > > All tests are now passing for this PR [0]. I built the docker image
> > > > > and pushed it to my personal repository to simplify testing [1] for
> > > > > anyone interested in verifying the changes.
> > > > >
> > > > > I would like our docker image to be as close to immutable as
> possible.
> > > > > As far as I can tell, here are the only reasons the image cannot be
> > > > > immutable:
> > > > >
> > > > > 1. Pulsar configuration is read in only from configuration files in
> > > > > `/pulsar/conf`. A non root user must be able to update these files
> to
> > > > > have run with custom configuration.
> > > > > 2. The Pulsar function worker unpacks nar files to
> > > > > `/pulsar/download/pulsar_functions` by default.
> > > > > 3. Pulsar tiered storage writes to `/pulsar` by default when using
> > > > > filesystem storage.
> > > > > 4. The Presto SQL worker writes to `/pulsar/lib/presto/` by
> default.
> > > > > 5. Pulsar-admin and functions write to `/pulsar/log` by default
> > > > > (possibly other components too).
> > > > >
> > > > > Note that even though bookkeepers and zookeepers write to
> > > > > `/pulsar/data`, they should be writing to docker volumes, in which
> > > > > case, the host's file system permissions are all that matter.
> > > > >
> > > > > If we can remove any of the above reasons, we can decrease the
> > > > > privilege in the docker image.
> > > > >
> > > > > The PR has more detail. Please take a look.
> > > > >
> > > > > Thanks,
> > > > > Michael
> > > > >
> > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT-1dec8b9
> > > > >
> > > > >
> > > > > On Fri, Dec 17, 2021 at 12:33 AM Michael Marshall <
> mmarshall@apache.org>
> > > > wrote:
> > > > > >
> > > > > > Hi Pulsar Community,
> > > > > >
> > > > > > I opened a PR to make our pulsar and pulsar-all docker images
> non root
> > > > > > and OpenShift compliant [0]. As some may remember, we had issues
> with
> > > > > > these changes before due to lack of testing. I plan to test
> thoroughly
> > > > > > before we merge this PR, and it'd be great to have others test
> too. I
> > > > > > published a build of my PR [1]. I also have an issue [2]
> tracking this
> > > > > > work.
> > > > > >
> > > > > > Please take a look. I hope to make our 2.10 release a non root
> release!
> > > > > >
> > > > > > Thanks,
> > > > > > Michael
> > > > > >
> > > > > > [0] https://github.com/apache/pulsar/pull/13376
> > > > > > [1] michaelmarshall/pulsar:2.10.0-SNAPSHOT
> > > > > > [2] https://github.com/apache/pulsar/issues/11269
> > > > >
> > > >
>