Posted to dev@beam.apache.org by Hannah Jiang <ha...@google.com> on 2020/01/10 00:48:07 UTC

Cleaning up SDK docker image tagging

Hi Community

Right now we use different default tags for Python (version or version.dev),
Java (version-SNAPSHOT) and Go (latest). I would like to clean this up and
make it consistent across all languages. Here is my proposal.

For released versions of the SDKs, the default tag will be the version
number (e.g. 2.17.0).
For unreleased versions of the SDKs, the default tag will be the version
number + '.dev' (e.g. 2.18.0.dev).

The default tag is used 1) when we build Docker images without specifying
a tag, and 2) when we run a job on a runner that uses Docker with the
default Docker images.
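
To make the tagging rule concrete, here is a minimal sketch in Python. The
function name default_image_tag and its "released" flag are made up for
illustration; they are not taken from the Beam code base.

```python
def default_image_tag(version: str, released: bool) -> str:
    """Return the proposed default Docker tag for an SDK image.

    `version` is the plain version number, e.g. "2.17.0".
    Released SDKs use it unchanged; unreleased SDKs get a ".dev" suffix.
    (Hypothetical helper, for illustration only.)
    """
    return version if released else version + ".dev"


print(default_image_tag("2.17.0", released=True))   # 2.17.0
print(default_image_tag("2.18.0", released=False))  # 2.18.0.dev
```

The same rule would apply to the Python, Java and Go images, which is the
point of the cleanup.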

Additionally, Beam will always look up images locally before pulling one
from the remote, so locally built images will not be overwritten by remote
ones.
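
A rough sketch of that lookup order, assuming the docker CLI is on the
PATH. The helper ensure_image and its injectable "run" parameter are
hypothetical; this is not how Beam implements it, only one way the check
could look.

```python
import subprocess


def ensure_image(image, run=subprocess.run):
    """Pull `image` only if it is not already available locally.

    Returns "local" if a local image was found, "pulled" otherwise.
    `run` is injectable so the logic can be exercised without Docker.
    (Hypothetical helper, for illustration only.)
    """
    # `docker image inspect` exits non-zero when the image is not present locally.
    found = run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if found.returncode == 0:
        return "local"
    # Nothing local: fall back to the remote registry.
    # With the real subprocess.run, check=True raises if the pull fails.
    run(["docker", "pull", image], check=True)
    return "pulled"
```

If the pull fails for an unreleased tag that was never published, this is
the place where a clear error message pointing at the build instructions
would go.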

This has a minor downside for users of unreleased versions: they need to
build a local image first before using Docker to run. I will add a clear
error message that explains the problem and links to documentation on how
to create images.

I would like to collect feedback from anyone who uses Docker. Does this
sound good? Is there anything I am missing?

Thanks,
Hannah

Re: Cleaning up SDK docker image tagging

Posted by Robert Bradshaw <ro...@google.com>.

On Fri, Jan 10, 2020 at 3:30 PM Kyle Weaver <kc...@google.com> wrote:
>
> > Does cloning a release, modifying the docker file, and building the
> > containers create a "new" container with a default release tag? If so,
> > we should discourage that
>
> Yes, and agreed. The doc you linked already mentions how to customize tags, maybe we could also recommend the user always makes their own tag whenever changing a released image.

I think we should discourage checking out the code and modifying the
docker file in place, but that's another discussion.

> On Fri, Jan 10, 2020 at 2:33 PM Robert Bradshaw <ro...@google.com> wrote:
>>
>> On Fri, Jan 10, 2020 at 12:48 PM Kyle Weaver <kc...@google.com> wrote:
>> >
>> > > Shall we ALSO tag the image with git commit version for local build to keep track of obsolete images.
>> >
>> > This would mean we would have to be able to access the git commit from the source, which might not be trivial (right now the Beam version e.g. "2.18.0.dev" is hard-coded in some properties files). And the way it is now keeps things simple and easy to read.
>>
>> It also means that as you're developing, you don't generate a long
>> trail of named containers that you'll never access again but are
>> harder to automatically prune.
>>
>> > > we can assume the images with the same tag are always identical
>>
>> This is only true if a developer never builds a container without
>> committing any local changes first.
>>
>> Image tags are like git tags. They also have hashes (like commit ids)
>> if one wants to ensure one is pointing to the exact same thing.
>>
>> > So far that's always been the case, but in case there are problems with the published container images and we have to update them, we want to make sure everyone gets the most up-to-date image [1].
>> >
>> > > 1. pull only when needed, so reduce unnecessary traffic for users.
>> >
>> > `docker pull` starts by checking if the local image is up-to-date with the remote, and most of the time it will be, so no more network usage beyond that is needed.
>> >
>> > > In case a user customize the image and rebuild it with the default tag
>> >
>> > The user should never need to build an image with the default release tag (e.g. 2.17.0). They will use either the .dev tag (the default) or even better, their own custom tag. (I suppose we can't stop users from manually tagging their own container with the release tag, but most people should know better.)
>>
>> Does cloning a release, modifying the docker file, and building the
>> containers create a "new" container with a default release tag? If so,
>> we should discourage that:
>> https://beam.apache.org/documentation/runtime/environments/#modifying-dockerfiles
>>
>> > > make it consistent for all languages
>> >
>> > Forgot to reply to this point -- I agree, +1.
>>
>> Also +1
>>
>> > [1] https://lists.apache.org/thread.html/7b5599f142785e616a1e943ff1c3da5213de370ed193373e01991bb6%40%3Cdev.beam.apache.org%3E
>> >
>> > On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com> wrote:
>> >>
>> >> >> This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run.
>> >> > Isn't that the current behavior?
>> >>
>> >> Our current behavior is pull & run. So in case both local and remote images are available, the local image is getting overwritten by the remote image.
>> >> A New approach will do run only, which will pull remote images only when local images are not available. Since we don't deploy different images with the same tag, we can assume the images with the same tag are always identical, unless a user customized it with the same tag.
>> >>
>> >> This has the following advantages.
>> >> 1. pull only when needed, so reduce unnecessary traffic for users.
>> >> 2. In case a user customize the image and rebuild it with the default tag, the local customized image is used as expected. With pull & run, remote image, instead of the customized image, is used.
>> >>
>> >> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>> >>>
>> >>> > This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run.
>> >>>
>> >>> Isn't that the current behavior?
>> >>>
>> >>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com> wrote:
>> >>>>
>> >>>> Hi Community
>> >>>>
>> >>>> Now we are using different default tags for Python(version or version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make it consistent for all languages and here is my proposal.
>> >>>>
>> >>>> For the released version of SDKs, the default tag will be version number. (ex: 2.17.0)
>> >>>> For the unreleased version of SDKs, the default tag will be version number + '.dev'. (ex: 2.18.0.dev)
>> >>>>
>> >>>> The default tag is used 1). when we build docker images without specifying a tag. 2) when we run a job with runners running on dockers with default docker images.
>> >>>>
>> >>>> Additionally, Beam will always lookup images locally before pulling one from remote, so the images built locally will not be overwritten by remote ones.
>> >>>>
>> >>>> This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run. I will add a clear error message to show the problem and add a link to a documentation of how to create images.
>> >>>>
>> >>>> I would like to collect feedback from whoever uses dockers. Does this sound good? Is there anything I am missing?
>> >>>>
>> >>>> Thanks,
>> >>>> Hannah
>> >>>>

Re: Cleaning up SDK docker image tagging

Posted by Kyle Weaver <kc...@google.com>.

> Does cloning a release, modifying the docker file, and building the
> containers create a "new" container with a default release tag? If so,
> we should discourage that

Yes, and agreed. The doc you linked already mentions how to customize tags;
maybe we could also recommend that users always create their own tag
whenever they change a released image.

On Fri, Jan 10, 2020 at 2:33 PM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Jan 10, 2020 at 12:48 PM Kyle Weaver <kc...@google.com> wrote:
> >
> > > Shall we ALSO tag the image with git commit version for local build to
> keep track of obsolete images.
> >
> > This would mean we would have to be able to access the git commit from
> the source, which might not be trivial (right now the Beam version e.g. "
> 2.18.0.dev" is hard-coded in some properties files). And the way it is
> now keeps things simple and easy to read.
>
> It also means that as you're developing, you don't generate a long
> trail of named containers that you'll never access again but are
> harder to automatically prune.
>
> > > we can assume the images with the same tag are always identical
>
> This is only true if a developer never builds a container without
> committing any local changes first.
>
> Image tags are like git tags. They also have hashes (like commit ids)
> if one wants to ensure one is pointing to the exact same thing.
>
> > So far that's always been the case, but in case there are problems with
> the published container images and we have to update them, we want to make
> sure everyone gets the most up-to-date image [1].
> >
> > > 1. pull only when needed, so reduce unnecessary traffic for users.
> >
> > `docker pull` starts by checking if the local image is up-to-date with
> the remote, and most of the time it will be, so no more network usage
> beyond that is needed.
> >
> > > In case a user customize the image and rebuild it with the default tag
> >
> > The user should never need to build an image with the default release
> tag (e.g. 2.17.0). They will use either the .dev tag (the default) or even
> better, their own custom tag. (I suppose we can't stop users from manually
> tagging their own container with the release tag, but most people should
> know better.)
>
> Does cloning a release, modifying the docker file, and building the
> containers create a "new" container with a default release tag? If so,
> we should discourage that:
>
> https://beam.apache.org/documentation/runtime/environments/#modifying-dockerfiles
>
> > > make it consistent for all languages
> >
> > Forgot to reply to this point -- I agree, +1.
>
> Also +1
>
> > [1]
> https://lists.apache.org/thread.html/7b5599f142785e616a1e943ff1c3da5213de370ed193373e01991bb6%40%3Cdev.beam.apache.org%3E
> >
> > On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com>
> wrote:
> >>
> >> >> This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
> >> > Isn't that the current behavior?
> >>
> >> Our current behavior is pull & run. So in case both local and remote
> images are available, the local image is getting overwritten by the remote
> image.
> >> A New approach will do run only, which will pull remote images only
> when local images are not available. Since we don't deploy different images
> with the same tag, we can assume the images with the same tag are always
> identical, unless a user customized it with the same tag.
> >>
> >> This has the following advantages.
> >> 1. pull only when needed, so reduce unnecessary traffic for users.
> >> 2. In case a user customize the image and rebuild it with the default
> tag, the local customized image is used as expected. With pull & run,
> remote image, instead of the customized image, is used.
> >>
> >> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
> >>>
> >>> > This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
> >>>
> >>> Isn't that the current behavior?
> >>>
> >>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
> wrote:
> >>>>
> >>>> Hi Community
> >>>>
> >>>> Now we are using different default tags for Python(version or
> version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to
> clean it up and make it consistent for all languages and here is my
> proposal.
> >>>>
> >>>> For the released version of SDKs, the default tag will be version
> number. (ex: 2.17.0)
> >>>> For the unreleased version of SDKs, the default tag will be version
> number + '.dev'. (ex: 2.18.0.dev)
> >>>>
> >>>> The default tag is used 1). when we build docker images without
> specifying a tag. 2) when we run a job with runners running on dockers with
> default docker images.
> >>>>
> >>>> Additionally, Beam will always lookup images locally before pulling
> one from remote, so the images built locally will not be overwritten by
> remote ones.
> >>>>
> >>>> This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to
> run. I will add a clear error message to show the problem and add a link to
> a documentation of how to create images.
> >>>>
> >>>> I would like to collect feedback from whoever uses dockers. Does this
> sound good? Is there anything I am missing?
> >>>>
> >>>> Thanks,
> >>>> Hannah
> >>>>

Re: Cleaning up SDK docker image tagging

Posted by Robert Bradshaw <ro...@google.com>.

On Fri, Jan 10, 2020 at 12:48 PM Kyle Weaver <kc...@google.com> wrote:
>
> > Shall we ALSO tag the image with git commit version for local build to keep track of obsolete images.
>
> This would mean we would have to be able to access the git commit from the source, which might not be trivial (right now the Beam version e.g. "2.18.0.dev" is hard-coded in some properties files). And the way it is now keeps things simple and easy to read.

It also means that as you're developing, you don't generate a long
trail of named containers that you'll never access again but are
harder to automatically prune.

> > we can assume the images with the same tag are always identical

This is only true if a developer never builds a container without
committing any local changes first.

Image tags are like git tags. They also have hashes (like commit ids)
if one wants to ensure one is pointing to the exact same thing.
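
As an illustration of the tag-versus-hash distinction: Docker accepts
references of the form repository@sha256:<digest>, which name exact image
content rather than a movable tag. The helper below is a hypothetical
sketch, and the repository name in the usage note is a placeholder.

```python
import re

# A sha256 digest is "sha256:" followed by 64 lowercase hex characters.
_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def pin_to_digest(repository: str, digest: str) -> str:
    """Build an image reference that names exact content rather than a tag.

    (Hypothetical helper, for illustration only.)
    """
    if not _DIGEST.fullmatch(digest):
        raise ValueError("expected a digest of the form sha256:<64 hex chars>")
    return f"{repository}@{digest}"
```

For example, pin_to_digest("example/sdk", digest) gives a reference that
keeps pointing at the same image even if someone rebuilds and reuses the
tag.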

> So far that's always been the case, but in case there are problems with the published container images and we have to update them, we want to make sure everyone gets the most up-to-date image [1].
>
> > 1. pull only when needed, so reduce unnecessary traffic for users.
>
> `docker pull` starts by checking if the local image is up-to-date with the remote, and most of the time it will be, so no more network usage beyond that is needed.
>
> > In case a user customize the image and rebuild it with the default tag
>
> The user should never need to build an image with the default release tag (e.g. 2.17.0). They will use either the .dev tag (the default) or even better, their own custom tag. (I suppose we can't stop users from manually tagging their own container with the release tag, but most people should know better.)

Does cloning a release, modifying the docker file, and building the
containers create a "new" container with a default release tag? If so,
we should discourage that:
https://beam.apache.org/documentation/runtime/environments/#modifying-dockerfiles

> > make it consistent for all languages
>
> Forgot to reply to this point -- I agree, +1.

Also +1

> [1] https://lists.apache.org/thread.html/7b5599f142785e616a1e943ff1c3da5213de370ed193373e01991bb6%40%3Cdev.beam.apache.org%3E
>
> On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com> wrote:
>>
>> >> This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run.
>> > Isn't that the current behavior?
>>
>> Our current behavior is pull & run. So in case both local and remote images are available, the local image is getting overwritten by the remote image.
>> A New approach will do run only, which will pull remote images only when local images are not available. Since we don't deploy different images with the same tag, we can assume the images with the same tag are always identical, unless a user customized it with the same tag.
>>
>> This has the following advantages.
>> 1. pull only when needed, so reduce unnecessary traffic for users.
>> 2. In case a user customize the image and rebuild it with the default tag, the local customized image is used as expected. With pull & run, remote image, instead of the customized image, is used.
>>
>> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>>>
>>> > This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run.
>>>
>>> Isn't that the current behavior?
>>>
>>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com> wrote:
>>>>
>>>> Hi Community
>>>>
>>>> Now we are using different default tags for Python(version or version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make it consistent for all languages and here is my proposal.
>>>>
>>>> For the released version of SDKs, the default tag will be version number. (ex: 2.17.0)
>>>> For the unreleased version of SDKs, the default tag will be version number + '.dev'. (ex: 2.18.0.dev)
>>>>
>>>> The default tag is used 1). when we build docker images without specifying a tag. 2) when we run a job with runners running on dockers with default docker images.
>>>>
>>>> Additionally, Beam will always lookup images locally before pulling one from remote, so the images built locally will not be overwritten by remote ones.
>>>>
>>>> This has a minor downside for the users who are using unreleased versions. They need to build a local image first before using docker to run. I will add a clear error message to show the problem and add a link to a documentation of how to create images.
>>>>
>>>> I would like to collect feedback from whoever uses dockers. Does this sound good? Is there anything I am missing?
>>>>
>>>> Thanks,
>>>> Hannah
>>>>

Re: Cleaning up SDK docker image tagging

Posted by Hannah Jiang <ha...@google.com>.

Thanks for pointing me to the thread. I agree with what was discussed
there; let's keep it as it is.
I will proceed with cleaning up tags only.

On Fri, Jan 10, 2020 at 12:48 PM Kyle Weaver <kc...@google.com> wrote:

> > Shall we ALSO tag the image with git commit version for local build to
> keep track of obsolete images.
>
> This would mean we would have to be able to access the git commit from the
> source, which might not be trivial (right now the Beam version e.g. "
> 2.18.0.dev" is hard-coded in some properties files). And the way it is
> now keeps things simple and easy to read.
>
> > we can assume the images with the same tag are always identical
>
> So far that's always been the case, but in case there are problems with
> the published container images and we have to update them, we want to make
> sure everyone gets the most up-to-date image [1].
>
> > 1. pull only when needed, so reduce unnecessary traffic for users.
>
> `docker pull` starts by checking if the local image is up-to-date with the
> remote, and most of the time it will be, so no more network usage beyond
> that is needed.
>
> > In case a user customize the image and rebuild it with the default tag
>
> The user should never need to build an image with the default release tag
> (e.g. 2.17.0). They will use either the .dev tag (the default) or even
> better, their own custom tag. (I suppose we can't stop users from manually
> tagging their own container with the release tag, but most people should
> know better.)
>
> > make it consistent for all languages
>
> Forgot to reply to this point -- I agree, +1.
>
> [1]
> https://lists.apache.org/thread.html/7b5599f142785e616a1e943ff1c3da5213de370ed193373e01991bb6%40%3Cdev.beam.apache.org%3E
>
> On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com>
> wrote:
>
>> >> This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to run.
>> > Isn't that the current behavior?
>>
>> Our current behavior is pull & run. So in case both local and remote
>> images are available, the local image is getting overwritten by the remote
>> image.
>> A New approach will do run only, which will pull remote images only when
>> local images are not available. Since we don't deploy different images with
>> the same tag, we can assume the images with the same tag are always
>> identical, unless a user customized it with the same tag.
>>
>> This has the following advantages.
>> 1. pull only when needed, so reduce unnecessary traffic for users.
>> 2. In case a user customize the image and rebuild it with the default
>> tag, the local customized image is used as expected. With pull & run,
>> remote image, instead of the customized image, is used.
>>
>> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>>
>>> > This has a minor downside for the users who are using unreleased
>>> versions. They need to build a local image first before using docker to run.
>>>
>>> Isn't that the current behavior?
>>>
>>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
>>> wrote:
>>>
>>>> Hi Community
>>>>
>>>> Now we are using different default tags for Python(version or
>>>> version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to
>>>> clean it up and make it consistent for all languages and here is my
>>>> proposal.
>>>>
>>>> For the released version of SDKs, the default tag will be version
>>>> number. (ex: 2.17.0)
>>>> For the unreleased version of SDKs, the default tag will be version
>>>> number + '.dev'. (ex: 2.18.0.dev)
>>>>
>>>> The default tag is used 1). when we build docker images without
>>>> specifying a tag. 2) when we run a job with runners running on dockers with
>>>> default docker images.
>>>>
>>>> Additionally, Beam will always lookup images locally before pulling one
>>>> from remote, so the images built locally will not be overwritten by remote
>>>> ones.
>>>>
>>>> This has a minor downside for the users who are using unreleased
>>>> versions. They need to build a local image first before using docker to
>>>> run. I will add a clear error message to show the problem and add a link to
>>>> a documentation of how to create images.
>>>>
>>>> I would like to collect feedback from whoever uses dockers. Does this
>>>> sound good? Is there anything I am missing?
>>>>
>>>> Thanks,
>>>> Hannah
>>>>

Re: Cleaning up SDK docker image tagging

Posted by Kyle Weaver <kc...@google.com>.

> Shall we ALSO tag the image with git commit version for local build to
> keep track of obsolete images.

This would mean we would have to be able to access the git commit from the
source, which might not be trivial (right now the Beam version, e.g.
"2.18.0.dev", is hard-coded in some properties files). And the way it is now
keeps things simple and easy to read.

> we can assume the images with the same tag are always identical

So far that's always been the case, but in case there are problems with the
published container images and we have to update them, we want to make sure
everyone gets the most up-to-date image [1].

> 1. pull only when needed, so reduce unnecessary traffic for users.

`docker pull` starts by checking if the local image is up-to-date with the
remote, and most of the time it will be, so no more network usage beyond
that is needed.

> In case a user customize the image and rebuild it with the default tag

The user should never need to build an image with the default release tag
(e.g. 2.17.0). They will use either the .dev tag (the default) or even
better, their own custom tag. (I suppose we can't stop users from manually
tagging their own container with the release tag, but most people should
know better.)

> make it consistent for all languages

Forgot to reply to this point -- I agree, +1.

[1]
https://lists.apache.org/thread.html/7b5599f142785e616a1e943ff1c3da5213de370ed193373e01991bb6%40%3Cdev.beam.apache.org%3E

On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com> wrote:

> >> This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
> > Isn't that the current behavior?
>
> Our current behavior is pull & run. So in case both local and remote
> images are available, the local image is getting overwritten by the remote
> image.
> A New approach will do run only, which will pull remote images only when
> local images are not available. Since we don't deploy different images with
> the same tag, we can assume the images with the same tag are always
> identical, unless a user customized it with the same tag.
>
> This has the following advantages.
> 1. pull only when needed, so reduce unnecessary traffic for users.
> 2. In case a user customize the image and rebuild it with the default tag,
> the local customized image is used as expected. With pull & run, remote
> image, instead of the customized image, is used.
>
> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>
>> > This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to run.
>>
>> Isn't that the current behavior?
>>
>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
>> wrote:
>>
>>> Hi Community
>>>
>>> Now we are using different default tags for Python(version or
>>> version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to
>>> clean it up and make it consistent for all languages and here is my
>>> proposal.
>>>
>>> For the released version of SDKs, the default tag will be version
>>> number. (ex: 2.17.0)
>>> For the unreleased version of SDKs, the default tag will be version
>>> number + '.dev'. (ex: 2.18.0.dev)
>>>
>>> The default tag is used 1). when we build docker images without
>>> specifying a tag. 2) when we run a job with runners running on dockers with
>>> default docker images.
>>>
>>> Additionally, Beam will always lookup images locally before pulling one
>>> from remote, so the images built locally will not be overwritten by remote
>>> ones.
>>>
>>> This has a minor downside for the users who are using unreleased
>>> versions. They need to build a local image first before using docker to
>>> run. I will add a clear error message to show the problem and add a link to
>>> a documentation of how to create images.
>>>
>>> I would like to collect feedback from whoever uses dockers. Does this
>>> sound good? Is there anything I am missing?
>>>
>>> Thanks,
>>> Hannah
>>>

Re: Cleaning up SDK docker image tagging

Posted by Valentyn Tymofieiev <va...@google.com>.

Hi Hannah,

+1 to standardizing .dev suffixes across all SDKs.

Whether to pull or not to pull was recently discussed in [1]. My personal
preference would be to pull images before starting the containers, and to
instruct users who want to customize containers to tag them with a new
tag, such as :customized-2.17.0. If we need to revisit this conversation,
consider continuing it in [1] to keep the conversation in one place.

[1]
https://lists.apache.org/thread.html/07131e314e229ec60100eaa2c0cf6dfc206bf2b0f78c3cee9ebb0bda@%3Cdev.beam.apache.org%3E

Thanks,
Valentyn

On Fri, Jan 10, 2020 at 9:52 AM Hannah Jiang <ha...@google.com> wrote:

> >> This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
> > Isn't that the current behavior?
>
> Our current behavior is pull & run. So in case both local and remote
> images are available, the local image is getting overwritten by the remote
> image.
> A New approach will do run only, which will pull remote images only when
> local images are not available. Since we don't deploy different images with
> the same tag, we can assume the images with the same tag are always
> identical, unless a user customized it with the same tag.
>
> This has the following advantages.
> 1. pull only when needed, so reduce unnecessary traffic for users.
> 2. In case a user customize the image and rebuild it with the default tag,
> the local customized image is used as expected. With pull & run, remote
> image, instead of the customized image, is used.
>
> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>
>> > This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to run.
>>
>> Isn't that the current behavior?
>>
>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
>> wrote:
>>
>>> Hi Community
>>>
>>> Now we are using different default tags for Python(version or
>>> version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to
>>> clean it up and make it consistent for all languages and here is my
>>> proposal.
>>>
>>> For the released version of SDKs, the default tag will be version
>>> number. (ex: 2.17.0)
>>> For the unreleased version of SDKs, the default tag will be version
>>> number + '.dev'. (ex: 2.18.0.dev)
>>>
>>> The default tag is used 1). when we build docker images without
>>> specifying a tag. 2) when we run a job with runners running on dockers with
>>> default docker images.
>>>
>>> Additionally, Beam will always lookup images locally before pulling one
>>> from remote, so the images built locally will not be overwritten by remote
>>> ones.
>>>
>>> This has a minor downside for the users who are using unreleased
>>> versions. They need to build a local image first before using docker to
>>> run. I will add a clear error message to show the problem and add a link to
>>> a documentation of how to create images.
>>>
>>> I would like to collect feedback from whoever uses dockers. Does this
>>> sound good? Is there anything I am missing?
>>>
>>> Thanks,
>>> Hannah
>>>

Re: Cleaning up SDK docker image tagging

Posted by Hannah Jiang <ha...@google.com>.

>> This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to run.
> Isn't that the current behavior?

Our current behavior is pull & run, so when both local and remote images
are available, the local image gets overwritten by the remote image.
The new approach will be run only, pulling remote images only when local
images are not available. Since we don't deploy different images with the
same tag, we can assume that images with the same tag are always identical,
unless a user has customized one and kept the same tag.

This has the following advantages:
1. Images are pulled only when needed, which reduces unnecessary traffic
for users.
2. If a user customizes the image and rebuilds it with the default tag, the
local customized image is used as expected. With pull & run, the remote
image is used instead of the customized one.
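
A toy model of the two policies, written in Python only to show the
difference in outcome. The function image_used and the policy names are
invented for this sketch, and what happens under pull & run when the tag
is not published remotely is an assumption on my side.

```python
def image_used(policy, local, remote):
    """Return which image content a job ends up running for one tag.

    `local` and `remote` stand for the image content behind the tag on the
    local machine and in the registry (None means the tag does not exist).
    (Hypothetical model, for illustration only.)
    """
    if policy == "pull_and_run":
        # Pulling replaces the local image whenever the remote tag exists.
        # Assumption: if nothing is published, the local image is still used.
        return remote if remote is not None else local
    if policy == "run_only":
        # Pull only when nothing is available locally.
        return local if local is not None else remote
    raise ValueError(f"unknown policy: {policy}")


print(image_used("pull_and_run", local="customized", remote="published"))  # published
print(image_used("run_only", local="customized", remote="published"))      # customized
```

When only the remote image exists, both policies end up with the published
image, so the difference only shows for locally built or customized images.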

On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:

> > This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
>
> Isn't that the current behavior?
>
> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
> wrote:
>
>> Hi Community
>>
>> Now we are using different default tags for Python(version or version.dev),
>> Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make
>> it consistent for all languages and here is my proposal.
>>
>> For the released version of SDKs, the default tag will be version number.
>> (ex: 2.17.0)
>> For the unreleased version of SDKs, the default tag will be version
>> number + '.dev'. (ex: 2.18.0.dev)
>>
>> The default tag is used 1). when we build docker images without
>> specifying a tag. 2) when we run a job with runners running on dockers with
>> default docker images.
>>
>> Additionally, Beam will always lookup images locally before pulling one
>> from remote, so the images built locally will not be overwritten by remote
>> ones.
>>
>> This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to
>> run. I will add a clear error message to show the problem and add a link to
>> a documentation of how to create images.
>>
>> I would like to collect feedback from whoever uses dockers. Does this
>> sound good? Is there anything I am missing?
>>
>> Thanks,
>> Hannah

Re: Cleaning up SDK docker image tagging

Posted by Hannah Jiang <ha...@google.com>.

> For the unreleased version of SDKs, the default tag will be version
> number + '.dev'. (ex: 2.18.0.dev)
>> Shall we ALSO tag the image with git commit version for local build to
>> keep track of obsolete images.

Let me clarify.
This is about release images. Dev images are only available locally,
when a developer builds them from a dev version.
Release images are deployed to Docker Hub each time we release a new
version of Beam.

Deploying snapshot images is on our to-do list; it will be tackled
separately later.
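
For reference, the default-tag rule from the proposal (version number for
released SDKs, version number + '.dev' for unreleased ones) can be written
as a small sketch. The function name and the `released` flag are
hypothetical and only illustrate the rule.

```python
def default_tag(version: str, released: bool) -> str:
    """Return the default image tag for an SDK version.

    Released SDKs use the plain version number; unreleased SDKs get a
    '.dev' suffix, as in the proposal.
    """
    return version if released else version + ".dev"


print(default_tag("2.17.0", released=True))   # 2.17.0
print(default_tag("2.18.0", released=False))  # 2.18.0.dev
```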

On Thu, Jan 9, 2020 at 5:09 PM Ankur Goenka <go...@google.com> wrote:

> >> For the released version of SDKs, the default tag will be version
> number. (ex: 2.17.0)
> +1
>
> >> For the unreleased version of SDKs, the default tag will be version
> number + '.dev'. (ex: 2.18.0.dev)
> Shall we ALSO tag the image with git commit version for local build to
> keep track of obsolete images.
>
> On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:
>
>> > This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to run.
>>
>> Isn't that the current behavior?
>>
>> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
>> wrote:
>>
>>> Hi Community
>>>
>>> Now we are using different default tags for Python(version or
>>> version.dev), Java(version-SNAPSHOT) and Go(latest). I would like to
>>> clean it up and make it consistent for all languages and here is my
>>> proposal.
>>>
>>> For the released version of SDKs, the default tag will be version
>>> number. (ex: 2.17.0)
>>> For the unreleased version of SDKs, the default tag will be version
>>> number + '.dev'. (ex: 2.18.0.dev)
>>>
>>> The default tag is used 1). when we build docker images without
>>> specifying a tag. 2) when we run a job with runners running on dockers with
>>> default docker images.
>>>
>>> Additionally, Beam will always lookup images locally before pulling one
>>> from remote, so the images built locally will not be overwritten by remote
>>> ones.
>>>
>>> This has a minor downside for the users who are using unreleased
>>> versions. They need to build a local image first before using docker to
>>> run. I will add a clear error message to show the problem and add a link to
>>> a documentation of how to create images.
>>>
>>> I would like to collect feedback from whoever uses dockers. Does this
>>> sound good? Is there anything I am missing?
>>>
>>> Thanks,
>>> Hannah

Re: Cleaning up SDK docker image tagging

Posted by Ankur Goenka <go...@google.com>.

>> For the released version of SDKs, the default tag will be version
>> number. (ex: 2.17.0)
+1

>> For the unreleased version of SDKs, the default tag will be version
>> number + '.dev'. (ex: 2.18.0.dev)
Shall we ALSO tag the image with the git commit version for local builds,
to keep track of obsolete images?
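
One way this could look, as a rough sketch only: the
`<version>.dev-<commit>` tag format, the function names, and the repository
name passed in are all assumptions for illustration, not anything Beam
does today. The real commands involved are `git rev-parse --short HEAD`
and `docker build -t`, which accepts several `-t` flags in one build.

```python
import subprocess
from typing import List


def current_commit() -> str:
    """Return the abbreviated hash of HEAD in the current git checkout."""
    return subprocess.check_output(
        ["git", "rev-parse", "--short", "HEAD"], text=True
    ).strip()


def dev_image_tags(version: str, commit: str) -> List[str]:
    """Return the default dev tag plus a commit-specific tag.

    The '<version>.dev-<commit>' format is an assumption for this sketch.
    """
    dev_tag = f"{version}.dev"
    return [dev_tag, f"{dev_tag}-{commit}"]


def build_command(repository: str, tags: List[str]) -> List[str]:
    """Assemble a `docker build` command that applies every tag."""
    cmd = ["docker", "build"]
    for tag in tags:
        cmd += ["-t", f"{repository}:{tag}"]
    return cmd + ["."]
```

With both tags applied, the default tag still resolves to the latest local
build, while the commit-specific tags show which older images are obsolete.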

On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver <kc...@google.com> wrote:

> > This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
>
> Isn't that the current behavior?
>
> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com>
> wrote:
>
>> Hi Community
>>
>> Now we are using different default tags for Python(version or version.dev),
>> Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make
>> it consistent for all languages and here is my proposal.
>>
>> For the released version of SDKs, the default tag will be version number.
>> (ex: 2.17.0)
>> For the unreleased version of SDKs, the default tag will be version
>> number + '.dev'. (ex: 2.18.0.dev)
>>
>> The default tag is used 1). when we build docker images without
>> specifying a tag. 2) when we run a job with runners running on dockers with
>> default docker images.
>>
>> Additionally, Beam will always lookup images locally before pulling one
>> from remote, so the images built locally will not be overwritten by remote
>> ones.
>>
>> This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to
>> run. I will add a clear error message to show the problem and add a link to
>> a documentation of how to create images.
>>
>> I would like to collect feedback from whoever uses dockers. Does this
>> sound good? Is there anything I am missing?
>>
>> Thanks,
>> Hannah

Re: Cleaning up SDK docker image tagging

Posted by Kyle Weaver <kc...@google.com>.

> This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.

Isn't that the current behavior?

On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang <ha...@google.com> wrote:

> Hi Community
>
> Now we are using different default tags for Python(version or version.dev),
> Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make
> it consistent for all languages and here is my proposal.
>
> For the released version of SDKs, the default tag will be version number.
> (ex: 2.17.0)
> For the unreleased version of SDKs, the default tag will be version number
> + '.dev'. (ex: 2.18.0.dev)
>
> The default tag is used 1). when we build docker images without specifying
> a tag. 2) when we run a job with runners running on dockers with default
> docker images.
>
> Additionally, Beam will always lookup images locally before pulling one
> from remote, so the images built locally will not be overwritten by remote
> ones.
>
> This has a minor downside for the users who are using unreleased versions.
> They need to build a local image first before using docker to run. I will
> add a clear error message to show the problem and add a link to a
> documentation of how to create images.
>
> I would like to collect feedback from whoever uses dockers. Does this
> sound good? Is there anything I am missing?
>
> Thanks,
> Hannah