Posted to commits@joshua.apache.org by mj...@apache.org on 2017/03/02 21:00:26 UTC

[1/2] incubator-joshua git commit: added docker file to build KenLM

Repository: incubator-joshua
Updated Branches:
  refs/heads/kenlm_docker [created] 493ece5e9


added docker file to build KenLM


Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/4df65430
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/4df65430
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/4df65430

Branch: refs/heads/kenlm_docker
Commit: 4df6543067755aa5272d495dc417af79dbcb5edf
Parents: 47e0d6c
Author: Matt Post <po...@cs.jhu.edu>
Authored: Tue Jan 31 20:28:58 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Thu Mar 2 13:12:49 2017 -0500

----------------------------------------------------------------------
 distribution/docker/kenlm/Dockerfile | 41 +++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/4df65430/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
new file mode 100644
index 0000000..57f3716
--- /dev/null
+++ b/distribution/docker/kenlm/Dockerfile
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM maven:latest
+
+LABEL Description="Builds the KenLM library for use with language packs" Vendor="Apache Software Foundation"
+
+RUN apt-get update && \
+    apt-get install -y \
+            cmake \
+            git \
+            g++ \
+            libboost-all-dev \
+            libbz2-dev \
+            libeigen3-dev \
+            liblzma-dev \
+            libz-dev \
+            make \
+            curl
+
+# set environment variables
+ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
+ENV JOSHUA=/opt/joshua
+
+# download Joshua source 
+RUN mkdir /opt/joshua
+WORKDIR /opt/joshua
+RUN curl -L https://api.github.com/repos/apache/incubator-joshua/tarball | tar --strip-components=1 -xzvf -
+RUN echo y | bash download-deps.sh kenlm


[2/2] incubator-joshua git commit: updated docker files for kenlm + language pack support

Posted by mj...@apache.org.
updated docker files for kenlm + language pack support

The existing Docker files were somewhat out of date. This push introduces
a new KenLM Docker image that builds KenLM and uses the joshua.config.kenlm
file found in language packs of version 3 and above.


Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/493ece5e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/493ece5e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/493ece5e

Branch: refs/heads/kenlm_docker
Commit: 493ece5e9e18da8eac406ece9311c7c4e3140437
Parents: 4df6543
Author: Matt Post <po...@cs.jhu.edu>
Authored: Thu Mar 2 15:59:18 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Thu Mar 2 15:59:18 2017 -0500

----------------------------------------------------------------------
 distribution/docker/Dockerfile              | 47 ------------------------
 distribution/docker/README.md               | 11 ++++++
 distribution/docker/ar-en-phrase/Dockerfile | 26 -------------
 distribution/docker/dev/Dockerfile          | 47 ++++++++++++++++++++++++
 distribution/docker/kenlm/Dockerfile        | 16 ++++++--
 distribution/docker/kenlm/README.md         | 13 +++++++
 distribution/docker/zh-en-hiero/Dockerfile  | 26 -------------
 7 files changed, 83 insertions(+), 103 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/Dockerfile b/distribution/docker/Dockerfile
deleted file mode 100644
index 949865d..0000000
--- a/distribution/docker/Dockerfile
+++ /dev/null
@@ -1,47 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM maven:latest
-
-LABEL Description="This image is used to provide a Joshua Decoder environment" Vendor="Apache Software Foundation"
-
-RUN apt-get update && \
-    apt-get install -y \
-            cmake \
-            git \
-            g++ \
-            libboost-all-dev \
-            libbz2-dev \
-            libeigen3-dev \
-            liblzma-dev \
-            libz-dev \
-            make \
-            ant
-
-
-RUN mkdir /opt/joshua
-WORKDIR /opt/joshua
-
-# set environment variables
-ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-ENV JOSHUA=/opt/joshua
-
-
-# copy Joshua source code to image
-COPY . $JOSHUA
-
-RUN sh download-deps.sh
-
-RUN mvn package

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/README.md
----------------------------------------------------------------------
diff --git a/distribution/docker/README.md b/distribution/docker/README.md
new file mode 100644
index 0000000..9fff788
--- /dev/null
+++ b/distribution/docker/README.md
@@ -0,0 +1,11 @@
+This directory contains files for using Joshua with Docker.
+
+- dev/Dockerfile 
+
+This will help you compile the development version of Joshua, including the third-party
+libraries and support tools.
+
+- kenlm/Dockerfile
+
+This is used by the language packs for getting the runtime version of Joshua to work
+with KenLM.
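
A possible way to build the two images, sketched as a shell script. The `joshua/kenlm` tag matches the name used in kenlm/README.md; the `joshua/dev` tag, the `run` and `build_images` helpers, and the `DRY_RUN` variable are hypothetical names introduced here. Because dev/Dockerfile runs `COPY . $JOSHUA`, it presumably has to be built with the repository root as the build context, whereas kenlm/Dockerfile downloads the source itself and needs no particular context. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: image tags and helper names are assumptions, not part of the commit.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run from the root of an incubator-joshua checkout.
build_images() {
    # dev image: the Dockerfile copies the whole source tree, so the context is "."
    run docker build -t joshua/dev -f distribution/docker/dev/Dockerfile .
    # kenlm image: the Dockerfile fetches the source with curl, so its own directory suffices
    run docker build -t joshua/kenlm distribution/docker/kenlm
}

build_images
```

Running the script with `DRY_RUN=0` from a checkout would execute both builds instead of printing them.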

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/ar-en-phrase/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/ar-en-phrase/Dockerfile b/distribution/docker/ar-en-phrase/Dockerfile
deleted file mode 100644
index 5b7d94c..0000000
--- a/distribution/docker/ar-en-phrase/Dockerfile
+++ /dev/null
@@ -1,26 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM joshua
-
-ENV language_pack=ar-en-phrase
-
-RUN mkdir /opt/$language_pack
-WORKDIR /opt/$language_pack
-
-RUN curl http://cs.jhu.edu/~post/language-packs/language-pack-ar-en-phrase-2015-03-18.tgz \
-    | tar xz --strip-components=1
-
-ENTRYPOINT ["./run-joshua.sh"]

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/dev/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/dev/Dockerfile b/distribution/docker/dev/Dockerfile
new file mode 100644
index 0000000..949865d
--- /dev/null
+++ b/distribution/docker/dev/Dockerfile
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM maven:latest
+
+LABEL Description="This image is used to provide a Joshua Decoder environment" Vendor="Apache Software Foundation"
+
+RUN apt-get update && \
+    apt-get install -y \
+            cmake \
+            git \
+            g++ \
+            libboost-all-dev \
+            libbz2-dev \
+            libeigen3-dev \
+            liblzma-dev \
+            libz-dev \
+            make \
+            ant
+
+
+RUN mkdir /opt/joshua
+WORKDIR /opt/joshua
+
+# set environment variables
+ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
+ENV JOSHUA=/opt/joshua
+
+
+# copy Joshua source code to image
+COPY . $JOSHUA
+
+RUN sh download-deps.sh
+
+RUN mvn package

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
index 57f3716..4927976 100644
--- a/distribution/docker/kenlm/Dockerfile
+++ b/distribution/docker/kenlm/Dockerfile
@@ -32,10 +32,18 @@ RUN apt-get update && \
 
 # set environment variables
 ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-ENV JOSHUA=/opt/joshua
+ENV JOSHUA=/code
+ENV PORT=5674
+ENV JOSHUA_ARGS=""
+ENV LD_LIBRARY_PATH=$JOSHUA/lib
 
-# download Joshua source 
-RUN mkdir /opt/joshua
-WORKDIR /opt/joshua
+# download Joshua source and compile KenLM
+RUN mkdir -p /code
+WORKDIR /code
 RUN curl -L https://api.github.com/repos/apache/incubator-joshua/tarball | tar --strip-components=1 -xzvf -
 RUN echo y | bash download-deps.sh kenlm
+
+# TODO: check that the LP version is correct
+
+# start Joshua
+ENTRYPOINT /model/joshua -config joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/kenlm/README.md
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/README.md b/distribution/docker/kenlm/README.md
new file mode 100644
index 0000000..a6d5900
--- /dev/null
+++ b/distribution/docker/kenlm/README.md
@@ -0,0 +1,13 @@
+This Docker container installs KenLM and uses it to start a language pack with KenLM
+language models instead of BerkeleyLM ones. It requires language packs of version 3 or above.
+
+To use it, you need to do two things when running docker:
+
+- Mount the version 3 language pack to /model
+- Choose a local (host) port and bind it to the docker port that Joshua will run on
+
+This can be accomplished with the following command:
+
+    docker run -p 127.0.0.1:5674:5674 -v /path/to/LP:/model -it joshua/kenlm
+
+This will make the language pack available on port 5674 on localhost.
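
The kenlm Dockerfile defines `PORT` and `JOSHUA_ARGS` as environment variables and expands them in its shell-form ENTRYPOINT, so it should be possible to override them at run time with `docker run -e`. A minimal sketch, assuming the image is tagged `joshua/kenlm` as in the README command; the `joshua_kenlm_cmd` helper is a hypothetical name, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: joshua_kenlm_cmd is a hypothetical helper, not part of the commit.
# Usage: joshua_kenlm_cmd /path/to/LP [port]
joshua_kenlm_cmd() {
    lp_dir=$1
    port=${2:-5674}   # 5674 is the default PORT set in kenlm/Dockerfile
    # -e PORT changes the port Joshua listens on inside the container;
    # -p binds the same port number on the host's loopback interface.
    printf 'docker run -e PORT=%s -p 127.0.0.1:%s:%s -v %s:/model -it joshua/kenlm\n' \
        "$port" "$port" "$port" "$lp_dir"
}

joshua_kenlm_cmd /path/to/LP 8080
```

Extra decoder flags could presumably be passed the same way, by adding `-e JOSHUA_ARGS=...` to the printed command.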

http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/zh-en-hiero/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/zh-en-hiero/Dockerfile b/distribution/docker/zh-en-hiero/Dockerfile
deleted file mode 100644
index 4c099a1..0000000
--- a/distribution/docker/zh-en-hiero/Dockerfile
+++ /dev/null
@@ -1,26 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM joshua
-
-ENV language_pack=zh-en-hiero
-
-RUN mkdir /opt/$language_pack
-WORKDIR /opt/$language_pack
-
-RUN curl http://cs.jhu.edu/~post/language-packs/zh-en-hiero-2016-01-13.tgz \
-    | tar xz --strip-components=1
-
-ENTRYPOINT ["./run-joshua.sh"]