Posted to commits@beam.apache.org by "Stephen Sisk (JIRA)" <ji...@apache.org> on 2017/04/20 16:21:04 UTC

[jira] [Commented] (BEAM-1878) IO ITs: how to handle custom docker images?

    [ https://issues.apache.org/jira/browse/BEAM-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977002#comment-15977002 ] 

Stephen Sisk commented on BEAM-1878:
------------------------------------

Current status:
There has been further discussion on the mailing list, but no resolution.

It looks like we *might* be able to get away with always using pre-created images hosted elsewhere (the one example where we weren't sure ended up not needing a custom image).

It sounds like folks are pushing us more towards a solution where we either host our own images or push them to Docker Hub.

Unassigning since I'm not actively working on this.

> IO ITs: how to handle custom docker images?
> -------------------------------------------
>
>                 Key: BEAM-1878
>                 URL: https://issues.apache.org/jira/browse/BEAM-1878
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Stephen Sisk
>            Assignee: Stephen Sisk
>
> Summary:
> IO ITs whose data stores need custom docker images can't currently run
> those data stores in a kubernetes cluster (which is where we host our data
> stores). I have a couple of options for how to solve this and am looking
> for feedback from folks involved in creating IO ITs, and for opinions on
> kubernetes.
> Details:
> We've discussed in the past that we'll want to allow developers to submit
> just a dockerfile, and then we'll use that when creating the data store on
> kubernetes. This is the case for ElasticsearchIO and I assume more data
> stores in the future will want to do this. It's also looking like it'll be
> necessary to use custom docker images for the HadoopInputFormatIO's
> cassandra ITs - to run a cassandra cluster, there doesn't seem to be a good
> image you can use out of the box.
> In either case, in order to retrieve a docker image, kubernetes needs a
> container registry - it will read the docker images from there. A simple
> private container registry doesn't work because kubernetes config files are
> static - this means that if local devs try to use the kubernetes files,
> the files point at the private container registry and the devs wouldn't be able to
> retrieve the images since they don't have access. They'd have to manually
> edit the files, which in theory is an option, but I don't consider that to
> be acceptable since it feels pretty unfriendly (it is simple, so if we
> really don't like the below options we can revisit it.)
> Quick summary of the options
> =======================
> We can:
> * Start using something like Kubernetes Helm - this adds more dependencies
> and a small amount of complexity (this is my recommendation, but only by a little)
> * Start pushing images to docker hub - this means they'll be publicly
> visible and raises the bar for maintenance of those images
> * Host our own public container registry - this means running our own
> public service with costs, etc.
> I discussed the options in detail in my original email to dev@:
> https://lists.apache.org/thread.html/ca53c338209a2120d710e2e775fce384c6b68dd7f207a807efa2534b@%3Cdev.beam.apache.org%3E
> I ran into this question while working on getting the HIFIO cassandra
> cluster running, so I might prototype with that.
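
To illustrate why static kubernetes config files are the sticking point, here is a minimal sketch of the templating idea behind the Helm option. It is not Helm itself: it uses Python's standard string.Template, and the manifest contents, the registry host, the image name and the render_manifest helper are all hypothetical names made up for this example.

```python
from string import Template

# Hypothetical pod manifest. The registry is a placeholder rather than a
# hard-coded private host, so the same file can serve CI and local devs.
MANIFEST_TEMPLATE = Template("""\
apiVersion: v1
kind: Pod
metadata:
  name: cassandra-it
spec:
  containers:
  - name: cassandra
    image: ${registry}/${image}:${tag}
""")


def render_manifest(registry: str, image: str, tag: str = "latest") -> str:
    """Return the manifest text with the image reference filled in."""
    return MANIFEST_TEMPLATE.substitute(registry=registry, image=image, tag=tag)


if __name__ == "__main__":
    # Assumed values: a local dev would pass a registry they can actually reach.
    print(render_manifest("registry.example.com/beam-it", "cassandra-cluster"))
```

With a static file, the registry host would be baked into the image line and a dev without access would have to edit it by hand. With a template, the registry becomes an input chosen at deploy time, at the cost of an extra rendering step and tool dependency.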



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)