Posted to dev@jena.apache.org by Reto Gmür <re...@apache.org> on 2015/09/10 14:56:35 UTC

Fuseki docker

Hi Stian, all,

I've been using your docker image from dockerhub. However it is using quite
an old version of Fuseki and the last commit in the respective branch of
your fork has been a while ago.

Are you planning on updating this to the current Fuseki version, or is there
another dockerized version of Fuseki?

Cheers,
Reto

Re: Fuseki docker

Posted by Andy Seaborne <an...@apache.org>.

On 20/09/15 21:09, Stian Soiland-Reyes wrote:
> Which command are you running, with which image?
>
> I assume you mean something like
>
>      docker run --rm -v /tmp/1:/fuseki stain/jena-fuseki

In an exploration of the database management issues about (not) losing
state across container restarts ...

DIR="/home/afs/jlib/apache-jena-fuseki-2.3.0"
DATA="$PWD/DB"

docker run -t --rm=true --name Fuseki2 \
        -v ${DIR}:/fuseki \
        -v ${DATA}:/fuseki/DB \
        --net=host \
        java:8 \
        env FUSEKI_HOME=/fuseki /fuseki/fuseki-server --update \
        --loc=/fuseki/DB /ds



--mem : no problems.

>
> If it is fuseki, it would not terminate by itself, so you used ctrl-C?

Yes

>
> What are the errors..?

Can't run a second time because of --name.  Can't delete the temporary 
layers.  Need to kill docker to unblock.

> Which Docker and Linux OS/kernel versions?

Latest / 1.8.2, and Ubuntu 15.04 latest / 3.19.0-21-generic.

> Do you have a use pattern to cause the issue, e.g. must some queries be run
> first?

run, ^C

>
> Where does the volume come from? Local folder, data container or something
> external?
>
> When you say zombie, is that true zombie according to ps, or just that
> docker rm won't work?
>
> What does docker ps -a say about this container?
>
> The preferred way to stop a running Docker container is
>
>      docker stop fuseki
>
> https://docs.docker.com/reference/commandline/stop/
>
> This is SIGTERM, followed by SIGKILL after (by default) 10 seconds.
>
> After this stops you can use
>
>      docker rm -v fuseki
>
> docker rm on a running container only works with -f, which sends the
> nastier SIGKILL right away (note, SIGKILL is not handed over to the JVM, so
> the kernel would need to tidy the file handles)


> I'm finding that "docker run --rm -v" does not work reliably when mapping in a
> TDB database directory.  The container is left running, seemingly in some
> zombie state.  This happens more often than not.
>
> My guess is that some of the mmap file cleanup is still pending when the
> process exits (or rather, the process becomes a Linux zombie or somesuch)
> and the container is not removable.
>
> Whether this is a docker problem, a kernel problem or a fact of linux life
> is not clear.
>
> No --rm and an immediate "docker kill; docker rm" has worked each time.
> (timing!)
>
> I am running docker direct - no VM (I'm on a linux kernel on the hardware)
>
> Has anyone else seen anything like this?
>
>          Andy
>

Re: Fuseki docker

Posted by Stian Soiland-Reyes <st...@apache.org>.

Which command are you running, with which image?

I assume you mean something like

    docker run --rm -v /tmp/1:/fuseki stain/jena-fuseki

If it is fuseki, it would not terminate by itself, so you used ctrl-C?

What are the errors..?

Which Docker and Linux OS/kernel versions?

Do you have a use pattern to cause the issue, e.g. must some queries be run
first?

Where does the volume come from? Local folder, data container or something
external?

When you say zombie, is that true zombie according to ps, or just that
docker rm won't work?

What does docker ps -a say about this container?
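
One way to answer the last two questions together is sketched below. It is
only an illustration: the helper names are hypothetical, and it assumes the
container name Fuseki2 used elsewhere in this thread and a Docker version
whose "docker inspect -f" exposes .State.Pid.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of Docker or Fuseki.

# Succeeds if a ps(1) STAT string (e.g. "Z", "Z+", "Ssl") marks a zombie.
is_zombie_stat() {
    case "$1" in
        Z*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print what Docker and the kernel each report about a container.
# Argument: container name (e.g. Fuseki2).
container_report() {
    local name="$1" pid stat
    docker ps -a --filter "name=${name}"
    pid=$(docker inspect -f '{{.State.Pid}}' "${name}") || return 1
    # A PID of 0 means Docker believes no process is running in the container.
    if [ "${pid}" -gt 0 ]; then
        stat=$(ps -o stat= -p "${pid}" | tr -d ' ')
        if is_zombie_stat "${stat}"; then
            echo "PID ${pid} is a true zombie (STAT=${stat})"
        else
            echo "PID ${pid} is not a zombie (STAT=${stat})"
        fi
    fi
}
```

Calling "container_report Fuseki2" after the failed removal would show whether
the kernel lists the process as a zombie or whether only "docker rm" is stuck.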

The preferred way to stop a running Docker container is

    docker stop fuseki

https://docs.docker.com/reference/commandline/stop/

This is SIGTERM, followed by SIGKILL after (by default) 10 seconds.

After this stops you can use

    docker rm -v fuseki
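
The stop-then-remove sequence above could be wrapped as follows. This is a
minimal sketch: the function name and the DOCKER override variable are
hypothetical (the override only exists so the commands can be dry-run with
"echo"), and the grace period is passed with the "-t" option of "docker stop".

```shell
#!/bin/bash
# Sketch only: stop_and_remove and the DOCKER variable are hypothetical.

# Stop a container with SIGTERM (SIGKILL after the grace period), then
# remove it together with its volumes. Removal is skipped if the stop fails.
# Arguments: container name, optional grace period in seconds (default 10).
stop_and_remove() {
    local name="$1" grace="${2:-10}"
    "${DOCKER:-docker}" stop -t "${grace}" "${name}" &&
        "${DOCKER:-docker}" rm -v "${name}"
}
```

For example, "stop_and_remove fuseki 30" gives the JVM up to 30 seconds to
shut down before Docker falls back to SIGKILL.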

docker rm on a running container only works with -f, which sends the
nastier SIGKILL right away (note, SIGKILL is not handed over to the JVM, so
the kernel would need to tidy the file handles)

> I'm finding that "docker run --rm -v" does not work reliably when mapping in a
> TDB database directory.  The container is left running, seemingly in some
> zombie state.  This happens more often than not.
>
> My guess is that some of the mmap file cleanup is still pending when the
> process exits (or rather, the process becomes a Linux zombie or somesuch)
> and the container is not removable.
>
> Whether this is a docker problem, a kernel problem or a fact of linux life
> is not clear.
>
> No --rm and an immediate "docker kill; docker rm" has worked each time.
> (timing!)
>
> I am running docker direct - no VM (I'm on a linux kernel on the hardware)
>
> Has anyone else seen anything like this?
>
>         Andy

Re: Fuseki docker

Posted by Andy Seaborne <an...@apache.org>.

I'm finding that "docker run --rm -v" does not work reliably when mapping in
a TDB database directory.  The container is left running, seemingly in
some zombie state.  This happens more often than not.

My guess is that some of the mmap file cleanup is still pending when the
process exits (or rather, the process becomes a Linux zombie or
somesuch) and the container is not removable.

Whether this is a docker problem, a kernel problem or a fact of linux 
life is not clear.

No --rm and an immediate "docker kill; docker rm" has worked each time. 
(timing!)
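
The workaround just described (no --rm, then kill and remove straight away)
could be scripted roughly as below. This is a sketch under assumptions: the
function name and the DOCKER override variable are hypothetical, and nothing
here explains why the timing matters.

```shell
#!/bin/bash
# Sketch only: kill_and_remove and the DOCKER variable are hypothetical.

# Kill the container (ignoring the error if it has already exited) and
# remove it immediately afterwards, mirroring "docker kill; docker rm".
# Argument: container name.
kill_and_remove() {
    local name="$1"
    "${DOCKER:-docker}" kill "${name}" >/dev/null 2>&1 || true
    "${DOCKER:-docker}" rm "${name}"
}

# Hypothetical usage: start the container without --rm and clean up when
# the script exits, for example after ctrl-C:
#
#   trap 'kill_and_remove Fuseki2' EXIT
#   docker run -t --name Fuseki2 ...
```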

I am running docker direct - no VM (I'm on a linux kernel on the hardware)

Has anyone else seen anything like this?

	Andy

Re: Fuseki docker

Posted by Stian Soiland-Reyes <st...@apache.org>.

 On 11 September 2015 at 09:29, Andy Seaborne <an...@apache.org> wrote:

> (hmm - ATM it is showing the old one. Are we seeing eventual consistency in
> action as the builder sweeps around?)

For the 2.3.0 tag I just updated my older "manual" approach, which
uses a different base image without Maven (as my PR is based on the
latest master).

I can make an alternative 2.3.0 based on the pull-request Maven
approach if I back-date the <version> in the pom.xml - but I thought
that would be a bit cheeky.

You can test the Pull Request docker image as docker run
stain/jena-fuseki:devel -- this would be using the 2.3.1-SNAPSHOT at
the time of build.
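
A possible way to try that image is sketched here. The function name, the
container name fuseki-devel and the DOCKER override variable are all
hypothetical, and the sketch assumes the image serves on Fuseki's default
port 3030.

```shell
#!/bin/bash
# Sketch only: run_devel, the container name and DOCKER are hypothetical.

# Pull the development image and run it in the foreground, publishing the
# container's port 3030 on the given host port (default 3030).
run_devel() {
    local port="${1:-3030}"
    "${DOCKER:-docker}" pull stain/jena-fuseki:devel &&
        "${DOCKER:-docker}" run --name fuseki-devel \
            -p "${port}:3030" stain/jena-fuseki:devel
}
```

Running "run_devel 8080" would then make the server reachable on host port
8080, if the port assumption holds.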

> There are 11 Fuseki's (some are Fuseki1) on hub.docker
>
> Are they all effectively the same?
> Anything to learn from the others?

> Better? Different?
>
> So there is a question of whether the community approach is better than a
> single project form.  Thoughts?

Many of them are lacking documentation.

Some are outdated.

The better ones are focused on the Fuseki 1 approach of manually
editing config files -- I think a main purpose of a Docker image is to
quickly get started - it should work out of the box. This fits very
well with Fuseki 2.

In my approach I also included instructions on how to use the
tdbloader as a way to populate the store.
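
The image's own loading instructions are not reproduced here. As a different
illustration, the sketch below loads a file with tdbloader using the
local-distribution style shown elsewhere in this thread. It assumes that
fuseki-server.jar bundles the tdb.tdbloader command class and that no Fuseki
server has the database open at the same time; the function name, the /staging
mount point and the DOCKER override variable are hypothetical.

```shell
#!/bin/bash
# Sketch only: load_tdb, /staging and the DOCKER variable are hypothetical.

# Bulk-load one RDF file into a TDB database directory.
# Arguments: Fuseki distribution dir, TDB database dir, RDF file to load.
load_tdb() {
    local dist="$1" db="$2" file="$3"
    local staging base
    # Mount the file's directory read-only so the loader can see the file.
    staging=$(cd "$(dirname "${file}")" && pwd) || return 1
    base=$(basename "${file}")
    "${DOCKER:-docker}" run \
        -v "${dist}":/fuseki \
        -v "${db}":/fuseki/DB \
        -v "${staging}":/staging:ro \
        java:8 \
        java -cp /fuseki/fuseki-server.jar tdb.tdbloader \
        --loc=/fuseki/DB "/staging/${base}"
}
```

The container is deliberately started without --rm, given the removal
problems reported in this thread, so it has to be removed by hand afterwards.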

Once you need more heavy production usage of course you will want your
own config files - so I don't think that should be 'hidden' - just
left as a second step.

It should be noted that last week or so the doc of
https://hub.docker.com/r/stain/jena-fuseki/ broke all hyper links due
to https://github.com/docker/hub-feedback/issues/195 :)

I guess that's life in the fast lane..

I wouldn't want to stand here and say that what I cooked together is
the Best And Only approach! There is also the issue of timing.

When I made my Docker image, Fuseki 2 was still experimental, and
there were no other Fuseki 2 images.

But I think if the project takes ownership, then discussion about what
the Docker image should be and how it should evolve can happen within
the Jena community rather than as spread-out comments on Alice's and
Bob's private approaches.

-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718

Re: Fuseki docker

Posted by Andy Seaborne <an...@apache.org>.

On 11/09/15 00:16, Stian Soiland-Reyes wrote:
> I have updated https://hub.docker.com/r/stain/jena-fuseki/ to have
> Fuseki 2.3.0 on the 'latest' and 2.3.0 tags.
>
> docker pull stain/jena-fuseki should pull 2.3.0

Great.

It has a different base image to the PR.

(hmm - ATM it is showing the old one. Are we seeing eventual consistency 
in action as the builder sweeps around?)

> I would prefer not to have this under /stain/ so it can be made
> official and managed by more than just me.
>
>
>
> I'm afraid progress in adapting the docker image was stalled here:
>
> https://github.com/apache/jena/pull/50
> https://issues.apache.org/jira/browse/JENA-909

There are 11 Fuseki's (some are Fuseki1) on hub.docker

Are they all effectively the same?
Anything to learn from the others?

Better? Different?

So there is a question of whether the community approach is better than 
a single project form.  Thoughts?

> I have rebased the pull request for the latest master. We should still
> think about how the release process would work before such a pull
> request is accepted.
>
>
>
> The pull request adds a jena-fuseki2/jena-fuseki-docker module, whose
> only purpose during a normal Jena build is for its version number to be
> updated as part of the release process. Within the Dockerfile it
> downloads the apache-jena-fuseki.tar.gz distribution from Maven
> Central (or http://repository.apache.org/snapshots)
>
> This makes for a more convoluted Docker build than strictly necessary, as it
> uses Maven only to resolve apache-jena-fuseki.tar.gz. It does however
> fit well into the current source code hierarchy and it means
> downstream users can just run "docker build ." in that folder to build
> it for themselves.
>
> https://github.com/stain/jena/blob/fuseki2-docker-maven/jena-fuseki2/jena-fuseki-docker/Dockerfile
>
>
>
> A simple approach (the one Andy mentioned that I tried first) is to
> just have a Dockerfile that downloads the specific release, e.g.
>
> https://github.com/stain/jena/blob/fuseki2-docker/jena-fuseki2/jena-fuseki-docker/Dockerfile
>
> This would fit better in a separate git branch/repository or at least
> outside of the Maven build (as it would have to be updated manually
> after a release).  Now should such a repository be owned by Apache
> Jena? I think it should. If not, then it should be a separate
> Dockerfile-only project - like I did for Virtuoso:
> https://github.com/stain/virtuoso-docker
>
> ((.. but this could potentially raise ASF trademark alarms!))
>
>
>
>
>
>
> Even with Maven updating this automatically, manual updates of the
> docker hub on https://hub.docker.com/ would still be required.
> Normally the Automated Build works well, but it can only be set up
> from a GitHub repository you have control over -- but Jena committers
> can't (easily) register automated builds against
> https://github.com/apache/jena/ (it's a read-only mirror) -- with a
> bit of push and luck INFRA could possibly do it though.
>
> One alternative here could be to push it manually to the Docker hub.
>
> This could be added to the Maven build with some Docker-maven plugins
> -- but this would mean the Jena Release Manager would have to have
> Docker installed to complete the release.
>
>
> Should the Docker image be covered by the [RELEASE] vote?  By putting
> it into the Maven source build that is voted over anyway, and by the
> dockerfile just putting the jena-fuseki distribution tar.gz that was
> pushed to Maven Central by the same vote - and thus by my reasoning
> pushing this elsewhere is not a new release or new binary, more like a
> kind of rewrapping (like changing a .zip to .tar.gz) - your views
> might differ.
>
>
>
>
>
> More appropriate would be to register as an official image (e.g.
> "jena", "jena-fuseki" or "fuseki") - see
> https://github.com/docker-library/official-images -- this is a bit
> more work to do properly and generally puts the Dockerfile content in
> https://github.com/docker-library/ - see for instance
> https://github.com/docker-library/httpd and
> https://github.com/docker-library/tomcat
>
> .. I think this would mean keeping (quite minimal) source code outside
> Apache's control, which is not quite in line with ASF policy, as for a new
> release you would afterwards need to submit changes to the
> https://github.com/docker-library/jena library.
>
>
> You can also refer to external repositories like in
> https://github.com/docker-library/official-images/blob/master/library/nginx
>   - going this route we could just refer to the commits on
> https://github.com/apache/jena and even use the same commit ID as used
> in the VOTE.   It is this I was thinking of with my maven approach, as
> it means 'anyone' can raise a pull request against jena in
> https://github.com/docker-library/official-images/ to just point to
> commit IDs of later releases - the Jena source code would have already
> the right Docker bits in it.
>
>
> I had hoped we would go down the route towards the official image
>
>
> I mean official as in gradually moving down:
> - Dockerfile Source code owned/managed by Apache Jena
> - Docker Hub entry uploaded/updated by Apache Jena committers / Release Manager
> - Docker image linked to from Fuseki documentation
> - Official "_" Docker Hub entry
>
>
>
>   - but if there is a lack of interest in making the Docker image officially
> part of Jena in any form, I would rather just split it out and keep
> maintaining it as a personal slightly outdated thing.
>
>
> On 10 September 2015 at 15:50, Andy Seaborne <an...@apache.org> wrote:
>> On 10/09/15 13:56, Reto Gmür wrote:
>>>
>>> Hi Stian, all,
>>>
>>> I've been using your docker image from dockerhub. However it is using
>>> quite
>>> an old version of Fuseki and the last commit in the respective branch of
>>> your fork has been a while ago.
>>>
>>> Are you planning on updating this to the current Fuseki version, or is there
>>> another dockerized version of Fuseki?
>>>
>>> Cheers,
>>> Reto
>>>
>>
>> The Dockerfile has "ENV FUSEKI_VERSION 2.0.0"  That, and the java7 root,
>> should be all that needs changing.
>>
>>
>> FYI: to run containerized from a local Fuseki distribution:
>>
>> Simple "run Fuseki" script for docker:
>> (may need local variation e.g volume)
>>
>> "localhost" works in this variation.
>>
>> -----------------------------------------
>> #!/bin/bash
>>
>> # Installation directory : change as needed.
>> DIR="....../apache-jena-fuseki-2.3.0"
>> # Caution: --net=host
>>
>> docker run -it --name Fuseki2 \
>>         -v ${DIR}:/fuseki \
>>         --net=host \
>>         java:8 \
>>         env FUSEKI_HOME=/fuseki /fuseki/fuseki-server --update --mem /ds
>> -----------------------------------------
>
>
>

Re: Fuseki docker

Posted by Stian Soiland-Reyes <st...@apache.org>.

I have updated https://hub.docker.com/r/stain/jena-fuseki/ to have
Fuseki 2.3.0 on the 'latest' and 2.3.0 tags.

docker pull stain/jena-fuseki should pull 2.3.0

I would prefer not to have this under /stain/ so it can be made
official and managed by more than just me.

I'm afraid progress in adapting the docker image was stalled here:

https://github.com/apache/jena/pull/50
https://issues.apache.org/jira/browse/JENA-909

I have rebased the pull request for the latest master. We should still
think about how the release process would work before such a pull
request is accepted.

The pull request adds a jena-fuseki2/jena-fuseki-docker module, whose
only purpose during a normal Jena build is for its version number to be
updated as part of the release process. Within the Dockerfile it
downloads the apache-jena-fuseki.tar.gz distribution from Maven
Central (or http://repository.apache.org/snapshots)
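
To illustrate the download step, the sketch below builds the Maven Central URL
for a released distribution. It is not taken from the Dockerfile in the pull
request: the function name is hypothetical, and the URL assumes the standard
Maven repository layout for groupId org.apache.jena and artifactId
apache-jena-fuseki.

```shell
#!/bin/bash
# Sketch only: fuseki_dist_url is a hypothetical helper.

# Print the Maven Central URL of the Fuseki distribution tarball.
# Argument: release version, e.g. 2.3.0.
fuseki_dist_url() {
    local version="$1"
    local base="https://repo1.maven.org/maven2/org/apache/jena/apache-jena-fuseki"
    echo "${base}/${version}/apache-jena-fuseki-${version}.tar.gz"
}

# Hypothetical usage: download and unpack the release.
#
#   curl -fsSL "$(fuseki_dist_url 2.3.0)" | tar -xz
```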

This makes for a more convoluted Docker build than strictly necessary, as it
uses Maven only to resolve apache-jena-fuseki.tar.gz. It does however
fit well into the current source code hierarchy and it means
downstream users can just run "docker build ." in that folder to build
it for themselves.

https://github.com/stain/jena/blob/fuseki2-docker-maven/jena-fuseki2/jena-fuseki-docker/Dockerfile

A simple approach (the one Andy mentioned that I tried first) is to
just have a Dockerfile that downloads the specific release, e.g.

https://github.com/stain/jena/blob/fuseki2-docker/jena-fuseki2/jena-fuseki-docker/Dockerfile

This would fit better in a separate git branch/repository or at least
outside of the Maven build (as it would have to be updated manually
after a release).  Now should such a repository be owned by Apache
Jena? I think it should. If not, then it should be a separate
Dockerfile-only project - like I did for Virtuoso:
https://github.com/stain/virtuoso-docker

((.. but this could potentially raise ASF trademark alarms!))

Even with Maven updating this automatically, manual updates of the
docker hub on https://hub.docker.com/ would still be required.
Normally the Automated Build works well, but it can only be set up
from a GitHub repository you have control over -- but Jena committers
can't (easily) register automated builds against
https://github.com/apache/jena/ (it's a read-only mirror) -- with a
bit of push and luck INFRA could possibly do it though.

One alternative here could be to push it manually to the Docker hub.
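
A manual push might look roughly like the sketch below. The function name and
the DOCKER override variable are hypothetical, the repository name is whatever
the pusher controls on the Docker hub, and some Docker versions may need an
extra force option to move an existing "latest" tag.

```shell
#!/bin/bash
# Sketch only: push_image and the DOCKER variable are hypothetical.

# Build an image from a Dockerfile directory, tag it with the release
# version and as "latest", and push both tags. Stops at the first failure.
# Arguments: repository name, version, optional build directory (default .).
push_image() {
    local repo="$1" version="$2" context="${3:-.}"
    local d="${DOCKER:-docker}"
    "$d" build -t "${repo}:${version}" "${context}" &&
        "$d" tag "${repo}:${version}" "${repo}:latest" &&
        "$d" push "${repo}:${version}" &&
        "$d" push "${repo}:latest"
}
```

For example, "push_image stain/jena-fuseki 2.3.0 jena-fuseki2/jena-fuseki-docker"
would publish the 2.3.0 and latest tags after a "docker login".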

This could be added to the Maven build with some Docker-maven plugins
-- but this would mean the Jena Release Manager would have to have
Docker installed to complete the release.

Should the Docker image be covered by the [RELEASE] vote?  By putting
it into the Maven source build that is voted over anyway, and by the
dockerfile just putting the jena-fuseki distribution tar.gz that was
pushed to Maven Central by the same vote - and thus by my reasoning
pushing this elsewhere is not a new release or new binary, more like a
kind of rewrapping (like changing a .zip to .tar.gz) - your views
might differ.

More appropriate would be to register as an official image (e.g.
"jena", "jena-fuseki" or "fuseki") - see
https://github.com/docker-library/official-images -- this is a bit
more work to do properly and generally puts the Dockerfile content in
https://github.com/docker-library/ - see for instance
https://github.com/docker-library/httpd and
https://github.com/docker-library/tomcat

.. I think this would mean keeping (quite minimal) source code outside
Apache's control, which is not quite in line with ASF policy, as for a new
release you would afterwards need to submit changes to the
https://github.com/docker-library/jena library.

You can also refer to external repositories like in
https://github.com/docker-library/official-images/blob/master/library/nginx
 - going this route we could just refer to the commits on
https://github.com/apache/jena and even use the same commit ID as used
in the VOTE.   It is this I was thinking of with my maven approach, as
it means 'anyone' can raise a pull request against jena in
https://github.com/docker-library/official-images/ to just point to
commit IDs of later releases - the Jena source code would have already
the right Docker bits in it.

I had hoped we would go down the route towards the official image

I mean official as in gradually moving down:
- Dockerfile Source code owned/managed by Apache Jena
- Docker Hub entry uploaded/updated by Apache Jena committers / Release Manager
- Docker image linked to from Fuseki documentation
- Official "_" Docker Hub entry

 - but if there is a lack of interest in making the Docker image officially
part of Jena in any form, I would rather just split it out and keep
maintaining it as a personal slightly outdated thing.

On 10 September 2015 at 15:50, Andy Seaborne <an...@apache.org> wrote:
> On 10/09/15 13:56, Reto Gmür wrote:
>>
>> Hi Stian, all,
>>
>> I've been using your docker image from dockerhub. However it is using
>> quite
>> an old version of Fuseki and the last commit in the respective branch of
>> your fork has been a while ago.
>>
>> Are you planning on updating this to the current Fuseki version, or is there
>> another dockerized version of Fuseki?
>>
>> Cheers,
>> Reto
>>
>
> The Dockerfile has "ENV FUSEKI_VERSION 2.0.0"  That, and the java7 root,
> should be all that needs changing.
>
>
> FYI: to run containerized from a local Fuseki distribution:
>
> Simple "run Fuseki" script for docker:
> (may need local variation e.g volume)
>
> "localhost" works in this variation.
>
> -----------------------------------------
> #!/bin/bash
>
> # Installation directory : change as needed.
> DIR="....../apache-jena-fuseki-2.3.0"
> # Caution: --net=host
>
> docker run -it --name Fuseki2 \
>        -v ${DIR}:/fuseki \
>        --net=host \
>        java:8 \
>        env FUSEKI_HOME=/fuseki /fuseki/fuseki-server --update --mem /ds
> -----------------------------------------

-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718

Re: Fuseki docker

Posted by Andy Seaborne <an...@apache.org>.

On 10/09/15 13:56, Reto Gmür wrote:
> Hi Stian, all,
>
> I've been using your docker image from dockerhub. However it is using quite
> an old version of Fuseki and the last commit in the respective branch of
> your fork has been a while ago.
>
> Are you planning on updating this to the current Fuseki version, or is there
> another dockerized version of Fuseki?
>
> Cheers,
> Reto
>

The Dockerfile has "ENV FUSEKI_VERSION 2.0.0".  That, and the java7 root,
should be all that needs changing.
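
The two changes could be applied with something like the sketch below. It
assumes the base image line reads exactly "FROM java:7", which is a guess at
what "the java7 root" looks like in that Dockerfile; the function name is
hypothetical.

```shell
#!/bin/bash
# Sketch only: bump_dockerfile is hypothetical, and "FROM java:7" is an
# assumed spelling of the base image line.

# Rewrite the Fuseki version and the Java base image in a Dockerfile.
# Arguments: path to the Dockerfile, new Fuseki version (e.g. 2.3.0).
bump_dockerfile() {
    local file="$1" version="$2" tmp
    tmp=$(mktemp) || return 1
    # Edit into a temporary file, then copy back to keep file permissions.
    sed -e "s/^ENV FUSEKI_VERSION .*/ENV FUSEKI_VERSION ${version}/" \
        -e 's/^FROM java:7/FROM java:8/' \
        "${file}" > "${tmp}" &&
        cat "${tmp}" > "${file}"
    local status=$?
    rm -f "${tmp}"
    return "${status}"
}
```

Running "bump_dockerfile Dockerfile 2.3.0" leaves every other line of the
file unchanged.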


FYI: to run containerized from a local Fuseki distribution:

Simple "run Fuseki" script for docker:
(may need local variation, e.g. the volume)

"localhost" works in this variation.

-----------------------------------------
#!/bin/bash

# Installation directory : change as needed.
DIR="....../apache-jena-fuseki-2.3.0"
# Caution: --net=host

docker run -it --name Fuseki2 \
        -v ${DIR}:/fuseki \
        --net=host \
        java:8 \
        env FUSEKI_HOME=/fuseki /fuseki/fuseki-server --update --mem /ds
-----------------------------------------
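
Once the script above is running, a quick check that the /ds dataset answers
queries might look like the sketch below. It assumes Fuseki's default port
3030 and the standard "query" service name; the function name is
hypothetical.

```shell
#!/bin/bash
# Sketch only: fuseki_query_url is a hypothetical helper.

# Print the SPARQL query endpoint URL for a dataset.
# Arguments (all optional): host, port, dataset name without leading slash.
fuseki_query_url() {
    local host="${1:-localhost}" port="${2:-3030}" dataset="${3:-ds}"
    echo "http://${host}:${port}/${dataset}/query"
}

# Hypothetical usage: send a trivial ASK query and request JSON results.
#
#   curl -s -G "$(fuseki_query_url)" \
#        -H 'Accept: application/sparql-results+json' \
#        --data-urlencode 'query=ASK {}'
```

Because the script uses --net=host, "localhost" on the host reaches the
server directly, as noted above.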