Posted to commits@arrow.apache.org by uw...@apache.org on 2018/05/17 13:50:49 UTC
[arrow] branch master updated: ARROW-2486: [C++/Python] Provide a
Docker image that contains all dependencies for development
This is an automated email from the ASF dual-hosted git repository.
uwe pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 941a1b7 ARROW-2486: [C++/Python] Provide a Docker image that contains all dependencies for development
941a1b7 is described below
commit 941a1b762c6d1ad1d86f67b5352875a00cf707d5
Author: Aneesh Karve <an...@gmail.com>
AuthorDate: Thu May 17 15:50:17 2018 +0200
ARROW-2486: [C++/Python] Provide a Docker image that contains all dependencies for development
Open items
- [x] Why is `py.test pyarrow` failing on plasma deps when script follows [docs](https://arrow.apache.org/docs/python/development.html#developing-on-linux-and-macos)?
- [x] Should `/script/*.sh` use the same code as developer docs to avoid denormalization?
- [x] Move docker image to Apache registry?
- [x] Multiple container strategy possible, but overly complex. Requires exposing volume on one container as a mount point for a second container. Only speeds up user's first build.
- [x] Are gcc/g++ 4.8 the ideal versions?
- [x] Unit tests needed?
- [x] Update README per resolution of above
Author: Aneesh Karve <an...@gmail.com>
Closes #2016 from akarve/master and squashes the following commits:
5aec17a8 <Aneesh Karve> final PR feedback; README indentation
---
dev/container/Dockerfile | 49 ++++++++++++++++++++++
dev/container/README.md | 76 +++++++++++++++++++++++++++++++++++
dev/container/script/arrow-build.sh | 33 +++++++++++++++
dev/container/script/env.sh | 28 +++++++++++++
dev/container/script/parquet-build.sh | 32 +++++++++++++++
dev/container/script/pyarrow-build.sh | 22 ++++++++++
6 files changed, 240 insertions(+)
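
The build scripts in the diff below hard-code container paths under `/io` (`/io/arrow/cpp`, `/io/arrow/python`, `/io/parquet-cpp`), so the host directory mounted at `/io` needs both clones in place before the container starts. A minimal pre-flight check might look like the following sketch. The function name `check_io_layout` is hypothetical and is not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of this commit: verify that a host directory
# has the layout the build scripts expect once it is mounted at /io.
check_io_layout() {
    io_dir="$1"
    missing=0
    # Sub-paths taken from arrow-build.sh, pyarrow-build.sh and parquet-build.sh.
    for sub in arrow/cpp arrow/python parquet-cpp; do
        if [ ! -d "$io_dir/$sub" ]; then
            echo "missing: $io_dir/$sub" >&2
            missing=1
        fi
    done
    # Return 0 only when every expected directory exists.
    return "$missing"
}
```

It could be run on the host as `check_io_layout /LOCAL/PATH/TO/io` before the `docker run` command from the README, so that a missing clone shows up before any build starts.
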
diff --git a/dev/container/Dockerfile b/dev/container/Dockerfile
new file mode 100644
index 0000000..de0729b
--- /dev/null
+++ b/dev/container/Dockerfile
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM ubuntu:18.04
+
+RUN apt-get update && \
+ apt-get install -y \
+ gcc-8 \
+ g++-8 \
+ vim \
+ git \
+ wget \
+ make \
+ ninja-build
+
+ENV CC=gcc-8
+ENV CXX=g++-8
+
+# Miniconda3 - latest installer, 64-bit x86 (Python 3.6 is pinned in the env below)
+RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O mconda.sh && \
+ /bin/bash mconda.sh -b -p miniconda && \
+ rm mconda.sh
+
+ENV PATH="/miniconda/bin:$PATH"
+
+# create conda env with deps
+RUN conda create -y -q -n pyarrow-dev \
+ python=3.6 numpy six setuptools cython pandas pytest \
+ cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
+ gflags brotli jemalloc lz4-c zstd -c conda-forge \
+ && conda clean --all
+
+ADD script ./script
+RUN chmod u=rwx ./script/*.sh
+
diff --git a/dev/container/README.md b/dev/container/README.md
new file mode 100644
index 0000000..d8636e9
--- /dev/null
+++ b/dev/container/README.md
@@ -0,0 +1,76 @@
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+# Apache Arrow development container
+* Includes all dependencies for Arrow development
+* Builds are incremental, mirrored to local file system
+* Resolves [ARROW-2486](https://issues.apache.org/jira/browse/ARROW-2486)
+
+## Get started
+
+### [Install Docker](https://docs.docker.com/install/)
+
+### Acquire image
+
+```
+$ docker pull quiltdata/arrow
+```
+
+### Populate host directory
+Keep git repos and subsequent build products in a persistent host
+directory, `io`, which is mounted at `/io` inside the container.
+
+```
+$ mkdir -p io/arrow
+$ git clone https://github.com/apache/arrow.git io/arrow
+$ mkdir -p io/parquet-cpp
+$ git clone https://github.com/apache/parquet-cpp.git io/parquet-cpp
+```
+Alternatively, to use existing git repos, nest them under `io` as
+`io/arrow` and `io/parquet-cpp`.
+
+### Run container, mount `/io` as volume
+
+```
+$ docker run \
+ --shm-size=2g \
+ -v /LOCAL/PATH/TO/io:/io \
+ -it quiltdata/arrow
+```
+
+### Use container
+Run the scripts to build Arrow C++, parquet-cpp, and pyarrow, in that order.
+
+See also [Arrow dev docs](https://arrow.apache.org/docs/python/development.html).
+
+```
+$ source script/env.sh
+$ script/arrow-build.sh
+$ script/parquet-build.sh
+$ script/pyarrow-build.sh
+# run tests
+$ cd /io/arrow/python
+$ py.test pyarrow
+```
+
+## Build container
+
+```
+$ docker build -t USERNAME/arrow .
+```
diff --git a/dev/container/script/arrow-build.sh b/dev/container/script/arrow-build.sh
new file mode 100644
index 0000000..ab448b8
--- /dev/null
+++ b/dev/container/script/arrow-build.sh
@@ -0,0 +1,33 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html
+mkdir -p /io/arrow/cpp/build
+pushd /io/arrow/cpp/build
+cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
+ -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+ -DARROW_PYTHON=on \
+ -DARROW_PLASMA=on \
+ -DARROW_BUILD_TESTS=off \
+ "-DCMAKE_CXX_FLAGS=$CXXFLAGS" \
+ -GNinja \
+ ..
+ninja
+ninja install
+popd
+
diff --git a/dev/container/script/env.sh b/dev/container/script/env.sh
new file mode 100644
index 0000000..cb24424
--- /dev/null
+++ b/dev/container/script/env.sh
@@ -0,0 +1,28 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+source activate pyarrow-dev
+export ARROW_BUILD_TYPE=release
+export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export PARQUET_BUILD_TOOLCHAIN=$CONDA_PREFIX
+export ARROW_HOME=$CONDA_PREFIX
+export PARQUET_HOME=$CONDA_PREFIX
+# For newer GCC per https://arrow.apache.org/docs/python/development.html#known-issues
+export CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
+export PYARROW_CXXFLAGS=$CXXFLAGS
diff --git a/dev/container/script/parquet-build.sh b/dev/container/script/parquet-build.sh
new file mode 100644
index 0000000..f06aa07
--- /dev/null
+++ b/dev/container/script/parquet-build.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+mkdir -p /io/parquet-cpp/build
+pushd /io/parquet-cpp/build
+cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
+ -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
+ -DPARQUET_BUILD_BENCHMARKS=off \
+ -DPARQUET_BUILD_EXECUTABLES=off \
+ -DPARQUET_BUILD_TESTS=off \
+ "-DCMAKE_CXX_FLAGS=$CXXFLAGS" \
+ -GNinja \
+ ..
+ninja
+ninja install
+popd
diff --git a/dev/container/script/pyarrow-build.sh b/dev/container/script/pyarrow-build.sh
new file mode 100644
index 0000000..927ff79
--- /dev/null
+++ b/dev/container/script/pyarrow-build.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# See also https://arrow.apache.org/docs/python/development.html#build-and-test
+cd /io/arrow/python
+python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
+ --with-parquet --with-plasma --inplace
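
The README's "Use container" steps run the three build scripts one after another, and none of the scripts stops on error, so a failed Arrow build would still be followed by the Parquet and pyarrow builds. One way to chain them with a fail-fast check is sketched below. The wrapper name `run_build_steps` is hypothetical and not part of this commit; the script names and their order are taken from the README above.

```shell
#!/bin/bash
# Hypothetical wrapper, not part of this commit: run the three build scripts
# in README order and stop at the first one that fails.
run_build_steps() {
    script_dir="$1"
    [ -f "$script_dir/env.sh" ] || { echo "no env.sh in $script_dir" >&2; return 1; }
    # Source env.sh so its exports (ARROW_HOME, CXXFLAGS, ...) reach each step.
    . "$script_dir/env.sh" || return 1
    for step in arrow-build.sh parquet-build.sh pyarrow-build.sh; do
        echo "==> $step"
        # Each step runs as a child process, so its `cd`/`pushd` calls
        # do not change the caller's working directory.
        "$script_dir/$step" || {
            echo "failed: $step" >&2
            return 1
        }
    done
}
```

Inside the container this would be called as `run_build_steps script` from the starting directory, followed by `cd /io/arrow/python && py.test pyarrow` as in the README.
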