Posted to dev@submarine.apache.org by pi...@apache.org on 2022/02/20 14:09:36 UTC
[submarine] branch master updated: SUBMARINE-1195. Reduce jupyter image size
This is an automated email from the ASF dual-hosted git repository.
pingsutw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git
The following commit(s) were added to refs/heads/master by this push:
new 92e6fe8 SUBMARINE-1195. Reduce jupyter image size
92e6fe8 is described below
commit 92e6fe88c5aa2131983d045b4dd35e8f2ed331d4
Author: Thinking <74...@qq.com>
AuthorDate: Sun Feb 20 19:15:51 2022 +0800
SUBMARINE-1195. Reduce jupyter image size
### What is this PR for?
Reduce the image size from 8G to 5G by:
1. Clearing cache files after the conda installation.
2. Removing the node_modules folder after JupyterLab has been built.
3. Cloning with `git clone --depth=1` and running the folder deletion command in the same layer.
At present, the largest contributors to the image size are the `tensorflow` and `torch` packages. This PR does not address them.
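The same-layer point in item 3 matters because each `RUN` instruction commits its own layer: files removed by a later `RUN` are only hidden, and the earlier layer still carries them. Below is a minimal sketch of the pattern, not the Dockerfile from this PR. The base image, the `apt-get` step, and the `/opt/` destination are assumptions for illustration; the clone URL and notebook path are taken from the diff.

```dockerfile
# Hypothetical base image, chosen only for illustration.
FROM ubuntu:20.04

# Install git and clear the apt lists in the same layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends git ca-certificates && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /tmp

# Anti-pattern (left commented out): the clone is committed in one layer,
# so the later `rm` cannot shrink the image.
# RUN git clone https://github.com/apache/submarine
# RUN rm -rf submarine

# Same-layer pattern: shallow clone, copy what is needed, then delete the
# checkout before the layer is committed. /opt/ is a hypothetical target.
RUN git clone --depth=1 https://github.com/apache/submarine && \
    cp submarine/submarine-sdk/pysubmarine/example/submarine_experiment_sdk.ipynb /opt/ && \
    rm -rf submarine
```

Running `docker history` on the built image lists the size of each layer, which is one way to check whether a cleanup step actually took effect.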
### What type of PR is it?
Improvement
### Todos
* [x] - Reduce the image size by adjusting the Dockerfile
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-1195
### How should this be tested?
The image can be built locally from this Dockerfile (<https://github.com/shangyuantech/submarine/blob/SUBMARINE-1195/dev-support/docker-images/jupyter/Dockerfile>). So far, I have tested it locally by opening the Jupyter page and running a Python import.
### Screenshots (if appropriate)
No
### Questions:
* Do the license files need updating? No
* Are there breaking changes for older versions? No
* Does this need new documentation? No
Author: Thinking <74...@qq.com>
Signed-off-by: Kevin <pi...@apache.org>
Closes #881 from cdmikechen/SUBMARINE-1195 and squashes the following commits:
36d9eda7 [Thinking] SUBMARINE-1195. Reduce jupyter image size
---
dev-support/docker-images/jupyter/Dockerfile | 44 +++++++++++++++-------------
1 file changed, 24 insertions(+), 20 deletions(-)
diff --git a/dev-support/docker-images/jupyter/Dockerfile b/dev-support/docker-images/jupyter/Dockerfile
index 90a0de0..c5b891c 100644
--- a/dev-support/docker-images/jupyter/Dockerfile
+++ b/dev-support/docker-images/jupyter/Dockerfile
@@ -75,9 +75,10 @@ RUN mv /tini /usr/local/bin/tini && chmod +x /usr/local/bin/tini
# Install conda
USER $NB_UID
ARG PYTHON_VERSION=default
-ENV MINICONDA_VERSION=4.8.3 \
- MINICONDA_MD5=751786b92c00b1aeae3f017b781018df \
- CONDA_VERSION=4.8.3
+# update conda version to 4.11.0
+ENV MINICONDA_VERSION=4.11.0 \
+ MINICONDA_MD5=7675bd23411179956bcc4692f16ef27d \
+ CONDA_VERSION=4.11.0
WORKDIR /tmp
RUN wget --quiet https://repo.continuum.io/miniconda/Miniconda3-py37_${MINICONDA_VERSION}-Linux-x86_64.sh && \
@@ -91,31 +92,34 @@ RUN wget --quiet https://repo.continuum.io/miniconda/Miniconda3-py37_${MINICONDA
conda config --system --set channel_priority strict && \
if [ ! $PYTHON_VERSION = 'default' ]; then conda install --yes python=$PYTHON_VERSION; fi && \
conda list python | grep '^python ' | tr -s ' ' | cut -d '.' -f 1,2 | sed 's/$/.*/' >> $CONDA_DIR/conda-meta/pinned && \
- conda clean --all -f -y && \
- rm -rf /home/$NB_USER/.cache/yarn
-
-RUN conda init bash
-# Install latest sumbarine python sdk and notebook
-RUN source ~/.bashrc && conda activate && \
- git clone https://github.com/apache/submarine && \
- pip install submarine/submarine-sdk/pysubmarine[tf,pytorch] && \
+ conda init bash && \
+ source ~/.bashrc && conda activate && \
+ # it is used for jupyter lab build
conda install nodejs && \
- conda install -c conda-forge jupyterlab jupyterlab-git && \
- jupyter lab build
+ conda install -c conda-forge jupyterlab jupyterlab-git cvxpy==1.0.21 && \
+ jupyter lab build && \
+ # remove node_modules
+ rm -rf /home/$NB_USER/.cache/yarn && \
+ rm -rf /opt/conda/share/jupyter/lab/staging/node_modules/* && \
+ # uninstall nodejs, nodejs should not be used after jupyter lab has been built
+ conda uninstall nodejs -y && \
+ # clear conda to remove index cache, lock files, unused cache packages, and tarballs in /opt/conda/pkgs
+ conda clean -a -y
+
+RUN pip --no-cache-dir install pyqlib==0.6.2
-# Add DeepFM example into notebook
-RUN cp submarine/submarine-sdk/pysubmarine/example/submarine_experiment_sdk.ipynb $HOME && \
+# Install latest sumbarine python sdk and notebook
+RUN git clone --depth=1 https://github.com/apache/submarine && \
+ # replace numpy==1.19.2 to numpy>=1.20.0
+ sed -i "s/numpy==1.19.2/numpy>=1.20.0/" submarine/submarine-sdk/pysubmarine/setup.py && \
+ pip --no-cache-dir install submarine/submarine-sdk/pysubmarine[tf,pytorch] && \
+ cp submarine/submarine-sdk/pysubmarine/example/submarine_experiment_sdk.ipynb $HOME && \
cp -r submarine/submarine-sdk/pysubmarine/example/{data,deepfm_example.ipynb,deepfm.json} $HOME && \
rm submarine -rf
-# Install latest stable qlib
-RUN conda install -c conda-forge cvxpy==1.0.21
-RUN pip install numpy==1.20.0 pyqlib==0.6.2
-
# Add qlib example in notebook
RUN wget https://raw.githubusercontent.com/microsoft/qlib/main/examples/workflow_by_code.ipynb -P $HOME
-
ENV MLFLOW_S3_ENDPOINT_URL http://submarine-minio-service:9000
ENV AWS_ACCESS_KEY_ID submarine_minio
ENV AWS_SECRET_ACCESS_KEY submarine_minio