Posted to dev@druid.apache.org by Don Bowman <do...@agilicus.com> on 2019/01/16 22:23:18 UTC

Docker image

Is anyone working on a Docker image? There are quite a few out there,
but they have various issues, usually security-related, as they
inherit from not-too-strong bases.

I have built one with gcr.io/distroless/java as the parent, and it seems
to work, but I am not sure whether there is a reason or strategy for not
having one in the repo, built by Travis and published to Docker Hub.

Some of us would like to deploy via Helm in Kubernetes, and the lack of
an official image makes that a bit complex.

Re: Docker image

Posted by Don Bowman <do...@agilicus.com>.
On Fri, 18 Jan 2019 at 08:19, Don Bowman <do...@agilicus.com> wrote:

>
> So I have a Dockerfile which I think will work well. Before I commit and
> create the PR, can I ask for some direction?
> Would we use this as
>
> a: the Dockerfile is in the source tree, and does a 'COPY .' into the
> container to build? This is what I did for
> https://github.com/fluent/fluent-bit. It makes it very simple to build
> the docker container as a developer, but also as part of the travis
> pipeline or the dockerhub pipeline.... This would mean replacing the 'git
> clone/checkout' in the below.
> b: The Dockerfile is standalone, it does a clone of the repo (as below).
> This is not as convenient for a dev (you need to have pushed) nor for the
> other pipelines, but it does allow for a single point in time release.
>
> For the 'config', what I have gone for is that any druid_XXX env variable is
> converted to a druid.xxx config entry in _common/properties or service/properties.
>
>
>
I have created a PR (https://github.com/apache/incubator-druid/pull/6896) using
option 'a' above. Comments welcome!

Re: Docker image

Posted by Don Bowman <do...@agilicus.com>.
So I have a Dockerfile which I think will work well. Before I commit and
create the PR, can I ask for some direction?
Would we use this as

a: the Dockerfile is in the source tree, and does a 'COPY .' into the
container to build? This is what I did for
https://github.com/fluent/fluent-bit. It makes it very simple to build the
docker container as a developer, but also as part of the travis pipeline or
the dockerhub pipeline.... This would mean replacing the 'git
clone/checkout' in the below.
b: The Dockerfile is standalone, it does a clone of the repo (as below).
This is not as convenient for a dev (you need to have pushed) nor for the
other pipelines, but it does allow for a single point in time release.

For the 'config', what I have gone for is that any druid_XXX env variable is
converted to a druid.xxx config entry in _common/properties or service/properties.
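
A minimal sketch of the env-variable conversion described above, not the actual druid.sh from the PR. The function name env_to_properties is hypothetical, and the mapping rule (every underscore becomes a dot, case is preserved) is an assumption; the thread only says druid_XXX becomes druid.xxx. The sketch prints the properties to stdout and leaves the choice of target file to the caller.

```shell
#!/bin/sh
# Sketch only: print one "druid.x.y=value" line per druid_* environment variable.
# Assumption (not stated in the thread): every underscore in the variable
# name becomes a dot, and the case of the name is preserved.
env_to_properties() {
    env | grep '^druid_' | sort | while IFS= read -r line; do
        name=${line%%=*}     # text before the first '='
        value=${line#*=}     # text after the first '=' (may itself contain '=')
        printf '%s=%s\n' "$(printf '%s' "$name" | tr '_' '.')" "$value"
    done
}
```

With druid_zk_service_host=zk.example.com exported, this emits druid.zk.service.host=zk.example.com. Values that contain newlines are not handled; only their first line would be picked up.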

I have created a 'batteries included' image with all the extensions in
place and the MySQL connector. It's a bit big, but all Java containers are
hefty. This has all services in it, but it runs whichever is on the command
line (e.g. docker run druid middleManager).
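
One way the 'runs whichever is on the command line' behaviour could look inside an entrypoint script is sketched below. This is an illustration only: resolve_service is a hypothetical helper, and the thread does not show how druid.sh actually selects or launches the service. The list of names is the standard set of Druid server types.

```shell
#!/bin/sh
# Sketch only: validate the service name passed as the first argument.
# A real entrypoint would go on to exec the JVM for that service; the exact
# launch command is not shown in the thread, so it is left out here.
resolve_service() {
    case "$1" in
        broker|coordinator|historical|middleManager|overlord|router)
            printf '%s\n' "$1"
            ;;
        *)
            echo "usage: druid.sh <broker|coordinator|historical|middleManager|overlord|router>" >&2
            return 1
            ;;
    esac
}
```

An entrypoint could then start with service=$(resolve_service "$1") || exit 1, so that a typo in the docker run arguments fails fast instead of starting the wrong process.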

I've also forced the 'forked subprocesses' in middleManager to log to
stdout, as is convention in containers. It's not convention to fork big
things like that, but that's another story.

Comments on a: vs b: ? Comments on other things to consider? I'll commit
and send a pull req for comment next week.

FROM maven:3-jdk-8 as builder

ARG DRUID_VERSION=0.13.0-incubating

RUN mkdir -p /src \
 && cd /src \
 && git clone https://github.com/apache/incubator-druid \
 && cd incubator-druid \
 && git checkout tags/druid-${DRUID_VERSION} \
 && mvn install -ff -DskipTests -Dforbiddenapis.skip=true -Pdist -Pbundle-contrib-exts \
 && tar -zxf ./distribution/target/apache-druid-${DRUID_VERSION}-bin.tar.gz -C /opt \
 && ln -s /opt/apache-druid-$DRUID_VERSION /opt/druid

COPY sha256sums.txt /tmp
RUN cd /opt/druid/extensions/mysql-metadata-storage \
 && wget -O mysql-connector-java-5.1.38.jar http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.jar \
 && sha256sum --ignore-missing -c /tmp/sha256sums.txt

RUN addgroup --gid 1000 druid \
 && adduser --home /opt/druid --shell /bin/sh --no-create-home --uid 1000 --gecos '' --gid 1000 --disabled-password druid \
 && mkdir -p /opt/druid/var \
 && chown -R druid:druid /opt/druid

# Use :debug to get a busybox shell so that scripts can run
FROM gcr.io/distroless/java:debug
MAINTAINER Don Bowman <do...@agilicus.com>

RUN ["/busybox/busybox", "--install", "/bin"]
COPY --from=builder /etc/passwd /etc/passwd
COPY --from=builder /etc/group /etc/group
COPY --from=builder --chown=druid /opt /opt
COPY druid.sh /druid.sh
USER druid
VOLUME /opt/druid/var
WORKDIR /opt/druid

ENTRYPOINT ["/druid.sh"]

Re: Docker image

Posted by Don Bowman <do...@agilicus.com>.
expectation? that I would never see maven :)

so given how the plugins work, it would either need to be
batteries-included (which is what I did), or would need to be a
'small'/'medium'/'large' w/ various sets of them.

given that i'm deploying in k8s, I would want to have the GCS/S3/Azure
blob storage extensions available.

I would expect the travis pipeline would build and publish to dockerhub
**or** the dockerhub build pipeline would see the tag push to github and
build automatically/label.

a single image is prob fine, but it could be druid:broker, druid:ingest, ..
etc.

env vars to configure is OK, but better would be a configmap that it would
reload via inotify when changed.

i would expect the container to have a single process. that it would have
100% of the content from signed sources.

mine works from distroless, i'll post it for interest. i had to build since
the released bundle doesn't have the contrib extensions.

I can do a proper one and send via PR as long as I understand what people
want/don't want.




On Wed, 16 Jan 2019 at 17:43, Charles Allen <ch...@snap.com.invalid>
wrote:

> The idea has been toyed around with internally here. What would your
> expectations be of such an image?
>
>
> On Wed, Jan 16, 2019 at 2:35 PM Don Bowman <do...@agilicus.com> wrote:
>
> > Is anyone working on a docker image? I mean, there are quite a few out
> > there but they have some various issues, usually security based as they
> > inherit from non-too-strong bases.
> >
> > I have done one w/ gcr.io/distroless/java as the parent, and it seems
> > working, but not sure if there is a reason or strategy for not having one
> > in the repo and built by travis to dockerhub.
> >
> > Some of us would like to be deploying via helm in kubernetes and this is
> > causing it to be a bit complex.
> >
>

Re: Docker image

Posted by Charles Allen <ch...@snap.com.INVALID>.
The idea has been toyed around with internally here. What would your
expectations be of such an image?


On Wed, Jan 16, 2019 at 2:35 PM Don Bowman <do...@agilicus.com> wrote:

> Is anyone working on a docker image? I mean, there are quite a few out
> there but they have some various issues, usually security based as they
> inherit from non-too-strong bases.
>
> I have done one w/ gcr.io/distroless/java as the parent, and it seems
> working, but not sure if there is a reason or strategy for not having one
> in the repo and built by travis to dockerhub.
>
> Some of us would like to be deploying via helm in kubernetes and this is
> causing it to be a bit complex.
>