Posted to commits@joshua.apache.org by mj...@apache.org on 2017/03/03 15:01:43 UTC
[1/7] incubator-joshua git commit: added docker file to build KenLM
Repository: incubator-joshua
Updated Branches:
refs/heads/kenlm_docker 493ece5e9 -> fd94d889d
refs/heads/master d49988e54 -> d86c3441b
added docker file to build KenLM
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/4df65430
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/4df65430
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/4df65430
Branch: refs/heads/master
Commit: 4df6543067755aa5272d495dc417af79dbcb5edf
Parents: 47e0d6c
Author: Matt Post <po...@cs.jhu.edu>
Authored: Tue Jan 31 20:28:58 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Thu Mar 2 13:12:49 2017 -0500
----------------------------------------------------------------------
distribution/docker/kenlm/Dockerfile | 41 +++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/4df65430/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
new file mode 100644
index 0000000..57f3716
--- /dev/null
+++ b/distribution/docker/kenlm/Dockerfile
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM maven:latest
+
+LABEL Description="Builds the KenLM library for use with language packs" Vendor="Apache Software Foundation"
+
+RUN apt-get update && \
+ apt-get install -y \
+ cmake \
+ git \
+ g++ \
+ libboost-all-dev \
+ libbz2-dev \
+ libeigen3-dev \
+ liblzma-dev \
+ libz-dev \
+ make \
+ curl
+
+# set environment variables
+ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
+ENV JOSHUA=/opt/joshua
+
+# download Joshua source
+RUN mkdir /opt/joshua
+WORKDIR /opt/joshua
+RUN curl -L https://api.github.com/repos/apache/incubator-joshua/tarball | tar --strip-components=1 -xzvf -
+RUN echo y | bash download-deps.sh kenlm
[7/7] incubator-joshua git commit: Merge branch 'kenlm_docker'
Posted by mj...@apache.org.
Merge branch 'kenlm_docker'
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/d86c3441
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/d86c3441
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/d86c3441
Branch: refs/heads/master
Commit: d86c3441baf93eb0d9207b548d1b5efa0913f44a
Parents: d49988e fd94d88
Author: Matt Post <po...@cs.jhu.edu>
Authored: Fri Mar 3 10:01:14 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Fri Mar 3 10:01:14 2017 -0500
----------------------------------------------------------------------
distribution/docker/Dockerfile | 47 -----------------------
distribution/docker/README.md | 11 ++++++
distribution/docker/ar-en-phrase/Dockerfile | 26 -------------
distribution/docker/dev/Dockerfile | 47 +++++++++++++++++++++++
distribution/docker/kenlm/Dockerfile | 49 ++++++++++++++++++++++++
distribution/docker/kenlm/README.md | 13 +++++++
distribution/docker/zh-en-hiero/Dockerfile | 26 -------------
scripts/language-pack/VERSIONS | 18 +++++++++
8 files changed, 138 insertions(+), 99 deletions(-)
----------------------------------------------------------------------
[4/7] incubator-joshua git commit: added full path to config file
Posted by mj...@apache.org.
added full path to config file
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/43f7a360
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/43f7a360
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/43f7a360
Branch: refs/heads/master
Commit: 43f7a3607dcc574ec1e702477dd3f3061a844c46
Parents: 493ece5
Author: Matt Post <po...@cs.jhu.edu>
Authored: Fri Mar 3 09:58:33 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Fri Mar 3 09:58:33 2017 -0500
----------------------------------------------------------------------
distribution/docker/kenlm/Dockerfile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/43f7a360/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
index 4927976..0feed0b 100644
--- a/distribution/docker/kenlm/Dockerfile
+++ b/distribution/docker/kenlm/Dockerfile
@@ -46,4 +46,4 @@ RUN echo y | bash download-deps.sh kenlm
# TODO: check that the LP version is correct
# start Joshua
-ENTRYPOINT /model/joshua -config joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS
+ENTRYPOINT /model/joshua -config /model/joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS
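Note on the change above: the image sets WORKDIR /code, while the language pack is mounted at /model. A relative "joshua.config.kenlm" is therefore presumably looked up under /code unless the joshua script changes directory first, which would explain the switch to an absolute path. The sketch below reproduces that lookup behaviour with temporary stand-in directories; the directory layout and the helper name config_visible_from are illustrative assumptions, not part of the commit.

```shell
#!/usr/bin/env bash
# Stand-ins for the image's /code (WORKDIR) and /model (mounted language pack).
root="$(mktemp -d)"
mkdir -p "$root/code" "$root/model"
touch "$root/model/joshua.config.kenlm"

# Hypothetical helper: report whether a config path resolves to a file
# when looked up from the given working directory.
config_visible_from() {
    local workdir="$1" config="$2"
    if (cd "$workdir" && [ -f "$config" ]); then
        echo found
    else
        echo missing
    fi
}

# Relative path: resolved against code/, where the file does not exist.
config_visible_from "$root/code" "joshua.config.kenlm"              # prints "missing"
# Absolute path: independent of the working directory.
config_visible_from "$root/code" "$root/model/joshua.config.kenlm"  # prints "found"

rm -rf "$root"
```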
[5/7] incubator-joshua git commit: added language pack VERSIONS file
Posted by mj...@apache.org.
added language pack VERSIONS file
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/fd94d889
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/fd94d889
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/fd94d889
Branch: refs/heads/master
Commit: fd94d889df2d47a5bb369dbbc031ef1df7167490
Parents: 43f7a36
Author: Matt Post <po...@cs.jhu.edu>
Authored: Fri Mar 3 10:01:05 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Fri Mar 3 10:01:05 2017 -0500
----------------------------------------------------------------------
scripts/language-pack/VERSIONS | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/fd94d889/scripts/language-pack/VERSIONS
----------------------------------------------------------------------
diff --git a/scripts/language-pack/VERSIONS b/scripts/language-pack/VERSIONS
new file mode 100644
index 0000000..e3dd10b
--- /dev/null
+++ b/scripts/language-pack/VERSIONS
@@ -0,0 +1,18 @@
+# Version 3 (March 2017)
+
+This was the first version actually versioned. It was built to work with docker building
+a KenLM language model.
+
+Includes KenLM language model files (recommended) in addition to BerkeleyLM.
+The latter is the default, with the former recommended and facilitated with a Docker
+container. Google API now multithreaded. Contained the new files:
+
+- joshua.config.kenlm (same config file but with KenLM instead of BerkeleyLM)
+- lp.conf (identifying the LP version and the git commit of the code)
+
+# Version 1-2 (prior to March 2017)
+
+These versions were not explicitly identified. They contained a "joshua" top-level script
+and "prepare.sh" for preparing data. Operates in server mode or from the command line.
+Entirely BerkeleyLM-based. Includes a Joshua 6.1 release candidate jar file.
+
[2/7] incubator-joshua git commit: updated docker files for kenlm + language pack support
Posted by mj...@apache.org.
updated docker files for kenlm + language pack support
There were some docker files but they were somewhat out of date. This push introduces
a new KenLM docker image which builds KenLM and makes use of the joshua.config.kenlm
file that is found in language packs version 3+.
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/493ece5e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/493ece5e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/493ece5e
Branch: refs/heads/master
Commit: 493ece5e9e18da8eac406ece9311c7c4e3140437
Parents: 4df6543
Author: Matt Post <po...@cs.jhu.edu>
Authored: Thu Mar 2 15:59:18 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Thu Mar 2 15:59:18 2017 -0500
----------------------------------------------------------------------
distribution/docker/Dockerfile | 47 ------------------------
distribution/docker/README.md | 11 ++++++
distribution/docker/ar-en-phrase/Dockerfile | 26 -------------
distribution/docker/dev/Dockerfile | 47 ++++++++++++++++++++++++
distribution/docker/kenlm/Dockerfile | 16 ++++++--
distribution/docker/kenlm/README.md | 13 +++++++
distribution/docker/zh-en-hiero/Dockerfile | 26 -------------
7 files changed, 83 insertions(+), 103 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/Dockerfile b/distribution/docker/Dockerfile
deleted file mode 100644
index 949865d..0000000
--- a/distribution/docker/Dockerfile
+++ /dev/null
@@ -1,47 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM maven:latest
-
-LABEL Description="This image is used to provide a Joshua Decoder environment" Vendor="Apache Software Foundation"
-
-RUN apt-get update && \
- apt-get install -y \
- cmake \
- git \
- g++ \
- libboost-all-dev \
- libbz2-dev \
- libeigen3-dev \
- liblzma-dev \
- libz-dev \
- make \
- ant
-
-
-RUN mkdir /opt/joshua
-WORKDIR /opt/joshua
-
-# set environment variables
-ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-ENV JOSHUA=/opt/joshua
-
-
-# copy Joshua source code to image
-COPY . $JOSHUA
-
-RUN sh download-deps.sh
-
-RUN mvn package
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/README.md
----------------------------------------------------------------------
diff --git a/distribution/docker/README.md b/distribution/docker/README.md
new file mode 100644
index 0000000..9fff788
--- /dev/null
+++ b/distribution/docker/README.md
@@ -0,0 +1,11 @@
+This directory contains files for using Joshua with Docker.
+
+- dev/Dockerfile
+
+This will help you compile the development version of Joshua, including the 3rd party
+libraries and support tools.
+
+- kenlm/Dockerfile
+
+This is used by the language packs for getting the runtime version of Joshua to work
+with KenLM.
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/ar-en-phrase/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/ar-en-phrase/Dockerfile b/distribution/docker/ar-en-phrase/Dockerfile
deleted file mode 100644
index 5b7d94c..0000000
--- a/distribution/docker/ar-en-phrase/Dockerfile
+++ /dev/null
@@ -1,26 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM joshua
-
-ENV language_pack=ar-en-phrase
-
-RUN mkdir /opt/$language_pack
-WORKDIR /opt/$language_pack
-
-RUN curl http://cs.jhu.edu/~post/language-packs/language-pack-ar-en-phrase-2015-03-18.tgz \
- | tar xz --strip-components=1
-
-ENTRYPOINT ["./run-joshua.sh"]
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/dev/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/dev/Dockerfile b/distribution/docker/dev/Dockerfile
new file mode 100644
index 0000000..949865d
--- /dev/null
+++ b/distribution/docker/dev/Dockerfile
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM maven:latest
+
+LABEL Description="This image is used to provide a Joshua Decoder environment" Vendor="Apache Software Foundation"
+
+RUN apt-get update && \
+ apt-get install -y \
+ cmake \
+ git \
+ g++ \
+ libboost-all-dev \
+ libbz2-dev \
+ libeigen3-dev \
+ liblzma-dev \
+ libz-dev \
+ make \
+ ant
+
+
+RUN mkdir /opt/joshua
+WORKDIR /opt/joshua
+
+# set environment variables
+ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
+ENV JOSHUA=/opt/joshua
+
+
+# copy Joshua source code to image
+COPY . $JOSHUA
+
+RUN sh download-deps.sh
+
+RUN mvn package
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
index 57f3716..4927976 100644
--- a/distribution/docker/kenlm/Dockerfile
+++ b/distribution/docker/kenlm/Dockerfile
@@ -32,10 +32,18 @@ RUN apt-get update && \
# set environment variables
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-ENV JOSHUA=/opt/joshua
+ENV JOSHUA=/code
+ENV PORT=5674
+ENV JOSHUA_ARGS=""
+ENV LD_LIBRARY_PATH=$JOSHUA/lib
-# download Joshua source
-RUN mkdir /opt/joshua
-WORKDIR /opt/joshua
+# download Joshua source and compile KenLM
+RUN mkdir -p /code
+WORKDIR /code
RUN curl -L https://api.github.com/repos/apache/incubator-joshua/tarball | tar --strip-components=1 -xzvf -
RUN echo y | bash download-deps.sh kenlm
+
+# TODO: check that the LP version is correct
+
+# start Joshua
+ENTRYPOINT /model/joshua -config joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS
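Note on the ENTRYPOINT above: it is written in Docker's shell form, so $PORT and $JOSHUA_ARGS are expanded by /bin/sh when the container starts rather than at build time. The ENV lines only supply defaults. The sketch below imitates that expansion locally with "sh -c" and "echo" so that no Docker installation or language pack is needed. The helper name render_entrypoint and the extra flag "-x 1" are placeholders, not real Joshua options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show the command line the shell-form ENTRYPOINT would
# produce for a given PORT ($1) and JOSHUA_ARGS ($2). "echo" stands in for
# actually launching /model/joshua.
render_entrypoint() {
    PORT="$1" JOSHUA_ARGS="$2" sh -c \
        'echo /model/joshua -config joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS'
}

# Defaults from the Dockerfile: PORT=5674, JOSHUA_ARGS="".
render_entrypoint 5674 ""
# Overridden values; "-x 1" is a placeholder argument.
render_entrypoint 8080 "-x 1"
```

With a real container the same overrides would be passed through the environment, for example with "docker run -e PORT=... -e JOSHUA_ARGS=...". Because the ENTRYPOINT is in shell form, arguments appended after the image name on the docker command line do not reach Joshua, which is presumably why JOSHUA_ARGS exists.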
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/kenlm/README.md
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/README.md b/distribution/docker/kenlm/README.md
new file mode 100644
index 0000000..a6d5900
--- /dev/null
+++ b/distribution/docker/kenlm/README.md
@@ -0,0 +1,13 @@
+This Docker container installs KenLM and uses it to start a language pack with KenLM
+language models instead of BerkeleyLM ones. It requires version 3 or above language packs.
+
+To use it, you need to do two things when running docker:
+
+- Mount the version 3 language pack to /model
+- Choose a local (host) port and bind it to the docker port that Joshua will run on
+
+This can be accomplished with the following command:
+
+ docker run -p 127.0.0.1:5674:5674 -v /path/to/LP:/model -it joshua/kenlm
+
+This will make the language pack available on port 5674 on localhost.
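Note on the README above: the two requirements (mount the language pack at /model, bind a host port to the container port) can be wrapped in a small helper. The following is a minimal sketch that only prints the command instead of running it, so it can be inspected first. The function name joshua_kenlm_cmd and its arguments are assumptions for illustration; the image name joshua/kenlm and port 5674 are taken from the README and the Dockerfile.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the docker command for a language pack.
#   $1 = host directory containing a version 3+ language pack
#   $2 = host port to bind (defaults to 5674)
# The container side stays at 5674, the PORT default in kenlm/Dockerfile.
joshua_kenlm_cmd() {
    local lp_dir="$1"
    local host_port="${2:-5674}"
    local container_port=5674
    printf 'docker run -p 127.0.0.1:%s:%s -v %s:/model -it joshua/kenlm\n' \
        "$host_port" "$container_port" "$lp_dir"
}

# Same command as in the README.
joshua_kenlm_cmd /path/to/LP
# Language pack served on host port 8080 instead.
joshua_kenlm_cmd /path/to/LP 8080
```

The printed line is not quoted for the shell, so a language pack path containing spaces would need quoting before the command is pasted and run.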
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/493ece5e/distribution/docker/zh-en-hiero/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/zh-en-hiero/Dockerfile b/distribution/docker/zh-en-hiero/Dockerfile
deleted file mode 100644
index 4c099a1..0000000
--- a/distribution/docker/zh-en-hiero/Dockerfile
+++ /dev/null
@@ -1,26 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-FROM joshua
-
-ENV language_pack=zh-en-hiero
-
-RUN mkdir /opt/$language_pack
-WORKDIR /opt/$language_pack
-
-RUN curl http://cs.jhu.edu/~post/language-packs/zh-en-hiero-2016-01-13.tgz \
- | tar xz --strip-components=1
-
-ENTRYPOINT ["./run-joshua.sh"]
[6/7] incubator-joshua git commit: added language pack VERSIONS file
Posted by mj...@apache.org.
added language pack VERSIONS file
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/fd94d889
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/fd94d889
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/fd94d889
Branch: refs/heads/kenlm_docker
Commit: fd94d889df2d47a5bb369dbbc031ef1df7167490
Parents: 43f7a36
Author: Matt Post <po...@cs.jhu.edu>
Authored: Fri Mar 3 10:01:05 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Fri Mar 3 10:01:05 2017 -0500
----------------------------------------------------------------------
scripts/language-pack/VERSIONS | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/fd94d889/scripts/language-pack/VERSIONS
----------------------------------------------------------------------
diff --git a/scripts/language-pack/VERSIONS b/scripts/language-pack/VERSIONS
new file mode 100644
index 0000000..e3dd10b
--- /dev/null
+++ b/scripts/language-pack/VERSIONS
@@ -0,0 +1,18 @@
+# Version 3 (March 2017)
+
+This was the first version actually versioned. It was built to work with docker building
+a KenLM language model.
+
+Includes KenLM language model files (recommended) in addition to BerkeleyLM.
+The latter is the default, with the former recommended and facilitated with a Docker
+container. Google API now multithreaded. Contained the new files:
+
+- joshua.config.kenlm (same config file but with KenLM instead of BerkeleyLM)
+- lp.conf (identifying the LP version and the git commit of the code)
+
+# Version 1-2 (prior to March 2017)
+
+These versions were not explicitly identified. They contained a "joshua" top-level script
+and "prepare.sh" for preparing data. Operates in server mode or from the command line.
+Entirely BerkeleyLM-based. Includes a Joshua 6.1 release candidate jar file.
+
[3/7] incubator-joshua git commit: added full path to config file
Posted by mj...@apache.org.
added full path to config file
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/43f7a360
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/43f7a360
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/43f7a360
Branch: refs/heads/kenlm_docker
Commit: 43f7a3607dcc574ec1e702477dd3f3061a844c46
Parents: 493ece5
Author: Matt Post <po...@cs.jhu.edu>
Authored: Fri Mar 3 09:58:33 2017 -0500
Committer: Matt Post <po...@cs.jhu.edu>
Committed: Fri Mar 3 09:58:33 2017 -0500
----------------------------------------------------------------------
distribution/docker/kenlm/Dockerfile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/43f7a360/distribution/docker/kenlm/Dockerfile
----------------------------------------------------------------------
diff --git a/distribution/docker/kenlm/Dockerfile b/distribution/docker/kenlm/Dockerfile
index 4927976..0feed0b 100644
--- a/distribution/docker/kenlm/Dockerfile
+++ b/distribution/docker/kenlm/Dockerfile
@@ -46,4 +46,4 @@ RUN echo y | bash download-deps.sh kenlm
# TODO: check that the LP version is correct
# start Joshua
-ENTRYPOINT /model/joshua -config joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS
+ENTRYPOINT /model/joshua -config /model/joshua.config.kenlm -server-type http -server-port $PORT -v 1 $JOSHUA_ARGS