Posted to dev@ofbiz.apache.org by Daniel Watford <da...@foomoo.co.uk> on 2023/03/09 11:34:39 UTC

Update on experimenting with docker deployments of OFBiz

Hello all,

The following is an update and proposed next steps for the experiment with
docker deployments to the Demo VM.

Jira ticket: OFBIZ-12757 - Experiment with deploying OFBiz containers to
the demo sites server (https://issues.apache.org/jira/browse/OFBIZ-12757)


What was proposed:
1. Create a branch in the ofbiz-framework repository from trunk where
Dockerfiles and other changes to the build process needed to produce Docker
containers for demo sites can be implemented.
2. Create one or more CNAME DNS entries, pointing to ofbiz-vm1.apache.org,
which can be used to access any experimental container-based demo site
instances.
3. Deploy Traefik as a reverse proxy to access the experimental demo sites,
assuming this approach does not conflict with any reverse proxy solution
already in place on the VM - hence the need to gain access and explore the
current server configuration.
4. Create build configuration(s) for the new branch to build the
ofbiz-framework as a container image with pre-loaded demo data, and push
the container image to an appropriate container repository.
5. Implement scripts at the VM to pull and deploy the latest version of the
demo container image daily.


What actually happened:
1. The ofbiz-framework branch was created -
https://github.com/apache/ofbiz-framework/tree/experimental-docker
2. New CNAME entries created for exp1.ofbiz.apache.org,
exp2.ofbiz.apache.org and exp3.ofbiz.apache.org.
3. Traefik was not used. Instead, additional vhosts were added to the
existing Apache HTTPD reverse proxy already in place on the host. By adding
the new domains to the Puppet configuration for this host, we benefited
from the automatic management of LetsEncrypt certificates.
4. Multiple variants of container images are being produced in a GitHub
Actions workflow and are being pushed to the GitHub Packages container
repository (ghcr.io/apache/ofbiz).
5. Cron jobs, docker compose application configurations and other scripts
for the ofbizdocker user can be found in the docker-experimental branch of
the ofbiz-tools repository, here -
https://github.com/apache/ofbiz-tools/tree/docker-experimental/demo-backup/ofbizdocker

Work on this experiment was tracked in OFBIZ-12757.


Difficulties experienced along the way:

The biggest problem we hit was a capacity issue on the OFBiz project's VM,
ofbiz-vm1.apache.org.

A good rule of thumb seemed to be that an OFBiz demo instance consumed
around 2GB of RAM. Judicious use of memory and CPU limits on the containers
to be deployed on ofbiz-vm1 meant we could squeeze a container instance of
OFBiz alongside the existing 3 demo sites and still fit into the total 8GB
of RAM available on the host.
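
For illustration, limits of this kind can be applied directly on the
docker command line (the flags are standard docker options; the image
name and exact values below are illustrative rather than the ones
actually used - equivalent limits can also be declared in a compose
file):

  # Cap a demo container at 2GB of RAM (no extra swap) and one CPU
  docker run -d --name ofbiz-demo \
    --memory=2g --memory-swap=2g --cpus=1.0 \
    ghcr.io/apache/ofbiz:experimental-docker-branch-preloaddemo-snapshot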

Everything seemed to be going well… until!  An OFBiz demo instance would
consume around 2GB of RAM, but to build that instance would take much more
- perhaps another 2GB. OFBiz demo instances are rebuilt daily at 03:00 UTC
and suddenly these rebuild processes - which would normally complete in a
few minutes - were taking hours to run. The system was under such heavy
load, presumably from all the garbage collecting and swapping, that it was
unresponsive to web requests and SSH connections.

On Friday 10th February INFRA-24185 was raised, requesting an additional
4GB of RAM for ofbiz-vm1. INFRA approved the request and plans were made to
shut down the VM to perform the upgrade the following Monday. However the
03:00 UTC scheduled rebuild didn't agree with the plan!

The next day (Saturday 11th February), due to the 03:00 rebuild, the VM was
unresponsive again (
https://lists.apache.org/thread/dh0dp8f3n8jyy8n3jbho7vsghy42f41f). INFRA
shut down the VM for us in response and performed the upgrade ahead of
schedule. They upgraded us from a 2vCPU 8GB VM to 8vCPU 32GB VM. A 4-times
increase in compute and memory - an unexpected and very much appreciated
upgrade!


Initial approach to container build and deployments:

With the new capacity on our demo VM, I created the configuration necessary
to build and deploy the first containerized OFBiz demo instances.

Each day at 02:35 UTC, a process running as the ofbizdocker user performed
a git pull of the experimental-docker branch, followed by docker build and
docker push commands, adding updated container images to the local
container registry on ofbiz-vm1.apache.org.

The nightly script would then shut down the docker-compose applications
behind exp1.ofbiz.apache.org and exp2.ofbiz.apache.org, pull their latest
images and restart the applications.
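
For anyone curious, the shape of that nightly job is roughly as follows
(the script name, paths and registry address are illustrative - the real
scripts live in the ofbiz-tools branch linked above, and the sketch
assumes a registry listening on localhost:5000):

  # crontab entry for the ofbizdocker user (illustrative)
  35 2 * * * /home/ofbizdocker/nightly-rebuild.sh

  # nightly-rebuild.sh (sketch)
  #!/bin/sh
  set -e
  cd /home/ofbizdocker/ofbiz-framework
  git pull origin experimental-docker
  docker build -t localhost:5000/ofbiz:snapshot .
  docker push localhost:5000/ofbiz:snapshot
  for app in exp1 exp2; do
    (cd "/home/ofbizdocker/$app" \
      && docker-compose down \
      && docker-compose pull \
      && docker-compose up -d)
  done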

The exp1.ofbiz.apache.org and exp2.ofbiz.apache.org sites were successfully
running OFBiz in containers, accessed via the Apache HTTPD reverse proxy.

I was even able to connect a debugger to OFBiz by exposing port 5005 on one
of the containers to the VM's localhost interface, and then opening an SSH
tunnel between that localhost port and my development host. IntelliJ
happily opened a remote debug session via the local end of the SSH tunnel
and I was able to step through Groovy and Java code.
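
The shape of that debugging setup, for anyone wanting to reproduce it
(image and user names are illustrative, and the sketch assumes the JVM
inside the container was started with the jdwp agent listening on port
5005):

  # On ofbiz-vm1: publish the debug port on the VM's localhost only
  docker run -d --name ofbiz-debug \
    -p 127.0.0.1:5005:5005 <ofbiz-image>

  # On the development host: forward a local port to the VM's
  # localhost:5005
  ssh -L 5005:localhost:5005 <user>@ofbiz-vm1.apache.org

  # IntelliJ then attaches a 'Remote JVM Debug' session to
  # localhost:5005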

The experiment had been successful so far and met its main objectives, but
I wasn't really happy having the demo VM building container images,
preferring instead to have the VM pull ready-built images from some other
location. This meant I would need to figure out how to build and host the
images.


How to build container images:

From reading the INFRA services pages, it seems our main options for
building software are Buildbot and GitHub Actions.

Buildbot is already used by the OFBiz project, so adding to the build
scripts should be a reasonable approach. However I couldn't find any
documentation about the availability of docker on the Buildbot worker
nodes, nor could I find examples of other projects using Buildbot to create
container images.

I also didn't have any experience dealing with Buildbot and was concerned
about the effort needed to get up to speed, whereas I did have experience
with GitHub Actions.

There was a wealth of information available on using GitHub Actions to
build container images - with much of it coming from the Apache Airflow
project's experiences. However, various Confluence pages suggested that
GitHub Actions should be avoided for the time being due to capacity issues
across the Apache Software Foundation's projects, which were leading to
excessive delays on builds. (See
https://infra.apache.org/github-actions-secrets.html and
https://cwiki.apache.org/confluence/display/BUILDS/GitHub+Actions+status)

While trying to decide which approach to take, a chance Slack conversation
on #asfinfra revealed that the capacity issues had been resolved for quite
some time and that there shouldn't be any blockers to using GitHub Actions
(GH Actions).

GH Actions has actions for building and pushing docker images out of the
box. It felt like it had a lower barrier to entry.


Where to publish container images:

Apache has an account on DockerHub where images could be pushed.
Credentials would need to be configured for our build tools - in this case
GH Actions. However, the organisation's account on GitHub also gives us GH
Packages.

GH Actions is well integrated with GH Packages, and pushing container
images there turned out to be an easy process - no INFRA requests for
configuration were necessary.
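
For reference, pushing from a workflow only needs a docker login against
ghcr.io using the workflow-scoped GITHUB_TOKEN. A minimal sketch,
assuming the token has been exposed to the step as an environment
variable (in practice the docker/login-action step does the same thing):

  echo "${GITHUB_TOKEN}" | docker login ghcr.io \
    -u "${GITHUB_ACTOR}" --password-stdin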


What is being built:

A new GH Actions workflow has been committed to the experimental-docker
branch. In response to a commit or tag push on the experimental-docker
branch only, the workflow does the following:
- Check out the ofbiz-framework sources.
- Build the ofbiz-framework container image with no data preloaded.
- Build the ofbiz-framework container image with demo data loaded to the
embedded Derby database.
- Run the gradle task to retrieve plugin sources.
- Build the ofbiz-framework + plugins image with no data preloaded.

The 3 built container images are then tagged according to the branch name
and SHA of the commit that triggered the workflow (a shell sketch of the
equivalent build-and-tag steps follows this list):
- experimental-docker-branch-snapshot: Always refers to the latest built
snapshot of the experimental-docker branch, without any data preloaded.
This tag will be 'moved' to newer builds as they occur.
- experimental-docker-branch-{{sha}}: As above, but this tag is not
reassigned to later builds.
- experimental-docker-branch-preloaddemo-snapshot: Always refers to the
latest built snapshot of the experimental-docker branch, with demo data
preloaded. This tag will be 'moved' to newer builds as they occur.
- experimental-docker-branch-preloaddemo-{{sha}}: As above, but this tag is
not reassigned to later builds.
- experimental-docker-branch-plugins-snapshot: Always refers to the latest
built snapshot of the experimental-docker branch + plugins, without any
data preloaded. This tag will be 'moved' to newer builds as they occur.
- experimental-docker-branch-plugins-{{sha}}: As above, but this tag is not
reassigned to later builds.
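
As mentioned above, here is a shell sketch of the equivalent
build-and-tag steps. The --build-arg name is illustrative - the actual
preload mechanism is defined by the Dockerfiles on the branch - and a
short SHA is shown for brevity, but the tag names match those listed:

  SHA=$(git rev-parse --short HEAD)
  REPO=ghcr.io/apache/ofbiz

  # Framework image, no data preloaded
  docker build -t "$REPO:experimental-docker-branch-snapshot" .
  docker tag "$REPO:experimental-docker-branch-snapshot" \
             "$REPO:experimental-docker-branch-$SHA"

  # Framework image with demo data preloaded into embedded Derby
  docker build --build-arg PRELOAD=demo \
    -t "$REPO:experimental-docker-branch-preloaddemo-snapshot" .
  docker tag "$REPO:experimental-docker-branch-preloaddemo-snapshot" \
             "$REPO:experimental-docker-branch-preloaddemo-$SHA"

  # Framework + plugins image (after running the gradle task that
  # retrieves plugin sources)
  docker build -t "$REPO:experimental-docker-branch-plugins-snapshot" .
  docker tag "$REPO:experimental-docker-branch-plugins-snapshot" \
             "$REPO:experimental-docker-branch-plugins-$SHA"

  # Push every tag of the repository in one go (Docker 20.10+)
  docker push --all-tags "$REPO"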

The container tags are very verbose at the moment, but I wanted to stress
to anyone reading them that they do not correspond to releases, hence
including the '-branch-' text in each tag name. Perhaps it would be
acceptable to just use 'snapshot' and not use '-branch-' or the {{sha}} in
the tag names.

Feedback on appropriate container tag naming conventions is very welcome.



What is currently deployed:

The 3 container images built and published from the experimental-docker
branch are pulled and deployed to the demo VM each day at 02:35 UTC:
- https://exp1.ofbiz.apache.org/partymgr - OFBiz Framework, deployed with
demo data.
- https://exp2.ofbiz.apache.org/partymgr - OFBiz Framework, deployed with
seed data.
- https://exp3.ofbiz.apache.org/partymgr - OFBiz Framework + Plugins,
deployed with demo data.



The above is hopefully a reasonable summary of activity for the docker
experiment. To get that live-action-replay feeling, I invite you to scroll
back to February 10th in the #ofbiz Slack channel, and the comments in
OFBIZ-12757, to see how things unfolded as Jacques, Eugen, Michael and I
dealt with the various issues with ofbiz-vm1.apache.org.



Proposed next steps:

I would like to merge the experimental-docker branch into trunk and modify
the GitHub Actions workflow to build snapshot docker container images in
response to pushes to trunk.

Container images would be published to ghcr.io/apache/ofbiz.
The variants of the container images built in response to git commits would
be:
- the ofbiz-framework without any data loaded. Tags: trunk-branch-snapshot
/ trunk-branch-{{sha}}
- the ofbiz-framework, preloaded with demo data using the embedded Derby
database. Tags: trunk-branch-preloaddemo-snapshot /
trunk-branch-preloaddemo-{{sha}}
- the ofbiz-framework plus plugins, without any data loaded. Tags:
trunk-branch-plugins-snapshot / trunk-branch-plugins-{{sha}}

The above container image tag names are subject to change based on feedback
received.

I would then like to replace the current approach to deploying demo-trunk
with a docker container, using the trunk-branch-plugins-snapshot image
variant. Using docker for this demo will work around the issue of 'ofbiz
--shutdown' not consistently working for demo-trunk (OFBIZ-10287).
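
The sketch below shows why: stopping and replacing the demo becomes a
container lifecycle operation rather than depending on OFBiz's own
shutdown handling (container name illustrative):

  # SIGTERM the container, escalating to SIGKILL after 120 seconds,
  # then replace it with a freshly pulled image
  docker stop --time 120 demo-trunk
  docker rm demo-trunk
  docker pull ghcr.io/apache/ofbiz:trunk-branch-plugins-snapshot
  docker run -d --name demo-trunk \
    ghcr.io/apache/ofbiz:trunk-branch-plugins-snapshot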

With the above in place I would propose waiting a week or two to address
any issues that might occur and to hopefully confirm that trunk snapshot
container images are consistently built and pushed in response to trunk
commits. If the container images and demo-trunk site work successfully then
I propose we apply similar changes to the release22.01 branch so that we
can offer users a quick-start based on docker deployments. We must
highlight, though, that container images are provided only as a
convenience, and that the official project release is the source code.

(GitHub Packages does provide download counts, so we might see some
interesting numbers:
https://github.com/apache/ofbiz-framework/pkgs/container/ofbiz)

If building container images for commits on the release22.01 branch,
proposed variants would follow the same pattern as for trunk commits, using
tags:
- release22.01-branch-snapshot / release22.01-branch-{{sha}}
- release22.01-branch-preloaddemo-snapshot /
release22.01-branch-preloaddemo-{{sha}}
- release22.01-branch-plugins-snapshot / release22.01-branch-plugins-{{sha}}

Further, we would also build container images in response to git tags being
pushed to the release22.01 branch. The container image tags for these would
be based on the git tag (see the sketch after this list). For example, if
pushing tag 'release22.01.03', the container image tags would be:
- 22.01.03
- 22.01.03-preloaddemo
- 22.01.03-plugins
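
A sketch of how those image tags could be derived inside the workflow
(GITHUB_REF_NAME is the standard GH Actions variable holding the pushed
tag name; the local image names are illustrative):

  # e.g. GITHUB_REF_NAME=release22.01.03 -> VERSION=22.01.03
  VERSION=${GITHUB_REF_NAME#release}
  REPO=ghcr.io/apache/ofbiz

  docker tag ofbiz "$REPO:$VERSION"
  docker tag ofbiz-preloaddemo "$REPO:$VERSION-preloaddemo"
  docker tag ofbiz-plugins "$REPO:$VERSION-plugins"
  docker push --all-tags "$REPO"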


Please could the community respond to the proposed next steps, both as a
possible way forward for using docker deployments on our demo sites AND as
a convenience for users wanting to try out OFBiz.

Thanks,

Dan.
-- 
Daniel Watford

Re: Update on experimenting with docker deployments of OFBiz

Posted by Jacques Le Roux <ja...@les7arts.com>.
Le 09/03/2023 à 16:57, Daniel Watford a écrit :
> No change is needed, but to prevent confusion we can map uid 1000 from
> within the OFBiz containers to an appropriate user present on the VM. If we
> go ahead I will request creation of a new user with minimal permissions to
> be mapped to.
That sounds like a plan, TIA

Re: Update on experimenting with docker deployments of OFBiz

Posted by Daniel Watford <da...@foomoo.co.uk>.
Hi Jacques,

The issue about the apparent process ownership by brianb is due to UID
(user id) 1000 getting resolved to a name.

Since brianb doesn't exist in /etc/passwd it looks like a component of the
OS or a system library is resolving the name using LDAP. It hasn't been
confirmed, but #asfinfra mentioned that Brian was the original foundation
sysadmin so likely had UID 1000.

No change is needed, but to prevent confusion we can map uid 1000 from
within the OFBiz containers to an appropriate user present on the VM. If we
go ahead I will request creation of a new user with minimal permissions to
be mapped to.
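
A sketch of both halves of that (the user name below is hypothetical -
no user has been requested yet):

  # How 'brianb' appears: NSS resolves UID 1000, apparently via LDAP
  getent passwd 1000

  # Possible fix: create a local user with UID 1000 so the name
  # resolves from /etc/passwd before LDAP is consulted
  sudo useradd --uid 1000 --no-create-home \
    --shell /usr/sbin/nologin ofbizdemo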

Thanks,

Dan.

On Thu, 9 Mar 2023 at 14:22, Jacques Le Roux <ja...@les7arts.com>
wrote:

> Le 09/03/2023 à 12:34, Daniel Watford a écrit :
> > Difficulties experienced along the way:
> >
> > The biggest problem we hit was a capacity issue on the OFBiz project's
> VM,
> > ofbiz-vm1.apache.org.
> >
> > A good rule of thumb seemed to be that an OFBiz demo instance consumed
> > around 2GB of RAM. Judicious use of memory and cpu limits on the
> containers
> > to be deployed on ofbiz-vm1 mean we could squeeze a container instance of
> > OFBiz alongside the existing 3 demo sites and still fit into the total
> 8GB
> > of RAM available on the host.
> >
> > Everything seemed to be going well… until!  An OFBiz demo instance would
> > consume around 2GB of RAM, but to build that instance would take much
> more
> > - perhaps another 2GB. OFBiz demo instances are rebuilt daily at 03:00
> UTC
> > and suddenly these rebuild processes - which would normally complete in a
> > few minutes - were taking hours to run. The system was under such heavy
> > load, presumably from all the garbage collecting and swapping, that it
> was
> > unresponsive to web requests and SSH connections.
> >
> > On Friday 10th February INFRA-24185 was raised, requesting an additional
> > 4GB of RAM for ofbiz-vm1. INFRA approved the request and plans were made
> to
> > shut down the VM to perform the upgrade the following Monday. However the
> > 03:00 UTC scheduled rebuild didn't agree with the plan!
> >
> > The next day (Saturday 11th February), due to the 03:00 rebuild, the VM
> was
> > unresponsive again (
> > https://lists.apache.org/thread/dh0dp8f3n8jyy8n3jbho7vsghy42f41f). INFRA
> > shut down the VM for us in response and performed the upgrade ahead of
> > schedule. They upgraded us from a 2vCPU 8GB VM to 8vCPU 32GB VM. A
> 4-times
> > increase in compute and memory - an unexpected and very much appreciated
> > upgrade!
>
> Thanks Daniel,
>
> I just want to add that we crossed another quite weird issue reported at
> https://issues.apache.org/jira/browse/INFRA-24303
> "brianb user (Brian Behlendorf) is weirdly running the trunk demo on
> ofbiz-vm1.apache.org VM."
> It's a random issue. We have no solution for now.
>
> Jacques
>
>

-- 
Daniel Watford

Re: Update on experimenting with docker deployments of OFBiz

Posted by Jacques Le Roux <ja...@les7arts.com>.
Le 09/03/2023 à 12:34, Daniel Watford a écrit :
> Difficulties experienced along the way:
>
> The biggest problem we hit was a capacity issue on the OFBiz project's VM,
> ofbiz-vm1.apache.org.
>
> A good rule of thumb seemed to be that an OFBiz demo instance consumed
> around 2GB of RAM. Judicious use of memory and cpu limits on the containers
> to be deployed on ofbiz-vm1 mean we could squeeze a container instance of
> OFBiz alongside the existing 3 demo sites and still fit into the total 8GB
> of RAM available on the host.
>
> Everything seemed to be going well… until!  An OFBiz demo instance would
> consume around 2GB of RAM, but to build that instance would take much more
> - perhaps another 2GB. OFBiz demo instances are rebuilt daily at 03:00 UTC
> and suddenly these rebuild processes - which would normally complete in a
> few minutes - were taking hours to run. The system was under such heavy
> load, presumably from all the garbage collecting and swapping, that it was
> unresponsive to web requests and SSH connections.
>
> On Friday 10th February INFRA-24185 was raised, requesting an additional
> 4GB of RAM for ofbiz-vm1. INFRA approved the request and plans were made to
> shut down the VM to perform the upgrade the following Monday. However the
> 03:00 UTC scheduled rebuild didn't agree with the plan!
>
> The next day (Saturday 11th February), due to the 03:00 rebuild, the VM was
> unresponsive again (
> https://lists.apache.org/thread/dh0dp8f3n8jyy8n3jbho7vsghy42f41f). INFRA
> shut down the VM for us in response and performed the upgrade ahead of
> schedule. They upgraded us from a 2vCPU 8GB VM to 8vCPU 32GB VM. A 4-times
> increase in compute and memory - an unexpected and very much appreciated
> upgrade!

Thanks Daniel,

I just want to add that we ran into another quite weird issue, reported at
https://issues.apache.org/jira/browse/INFRA-24303
"brianb user (Brian Behlendorf) is weirdly running the trunk demo on ofbiz-vm1.apache.org VM."
It's a random issue. We have no solution for now.

Jacques